Abstract

Low-rank matrix completion (LRMC) problems arise in a wide variety
of applications. Previous theory mainly provides conditions for completion
under missing-at-random samplings. This paper studies deterministic
conditions for completion. An incomplete d × N matrix is finitely rank-r
completable if there are at most finitely many rank-r matrices that agree
with all its observed entries. Finite completability is the tipping point in
LRMC, as a few additional samples of a finitely completable matrix guarantee
its unique completability. The main contribution of this paper is
a deterministic sampling condition for finite completability. We also use it
to derive deterministic sampling conditions for unique completability
that can be efficiently verified. Finally, we show that under uniform random
sampling schemes, these conditions are satisfied with high probability if
O(max{r, log d}) entries per column are observed. These findings have
several implications for LRMC regarding lower bounds, sample and computational
complexity, the role of coherence, adaptive settings and the
validation of any completion algorithm. We complement our theoretical
results with experiments that support our findings and motivate future
analysis of uncharted sampling regimes.


1 Introduction 

Low-rank matrix completion (LRMC) has attracted a lot of attention in recent 
years because of its broad range of applications, e.g., recommender systems and 
collaborative filtering [1] and image processing [2]. 

The problem entails exactly recovering all the entries in a d × N rank-r matrix,
given only a subset of its entries. LRMC is usually studied under a
missing-at-random and bounded-coherence model. Under this model, necessary and
sufficient conditions for perfect recovery are known [3-8]. Other approaches require
additional coherence and spectral gap conditions [9], or use rigidity theory [10]
or algebraic geometry and matroid theory [11], to derive necessary and sufficient
conditions for completion of deterministic samplings, but a characterization of
completable sampling patterns remains an important open question.
We say an incomplete matrix is finitely rank-r completable if there exist
at most finitely many rank-r matrices that agree with all its observed entries.
There exist sampling/observation patterns that guarantee finite completability,
but if just a single one of the observed entries is instead missing, then there are
infinitely many completions. Conversely, adding a few observations to such a
pattern guarantees unique completability. Thus, finite completability is the
tipping point in LRMC.

Whether a matrix is finitely completable depends on which entries are observed.
Yet no characterization of the sets of observed entries that allow or
prevent finite completability is known.

The main result of this paper is a sampling condition for finite completability,
that is, a condition on the observed entries of a matrix to guarantee that
it can be completed in at most finitely many ways. In addition, we provide
deterministic sampling conditions for unique completability that can be efficiently
verified. Finally, we show that uniform random samplings with O(max{r, log d})
entries per column satisfy these conditions with high probability.

Our results have implications for LRMC regarding lower bounds, sample and
computational complexity, the role of coherence, adaptive settings and validation
conditions to verify the output of any completion algorithm. We complement
our theoretical results with experiments that support our findings and
motivate future analysis of uncharted sampling regimes.

Organization of the Paper 

In Section 2 we formally state the problem and our main results. In Section 3 
we discuss their implications in the context of previous work, and present our 
experiments. We present the proof of our main theorem in Section 4, and we 
leave the proofs of our other statements to Sections 5 and 6. 

2 Model and Main Results 

Let X_Ω denote the incomplete version of a d × N, rank-r data matrix X, observed
only in the nonzero locations of Ω, a d × N matrix with binary entries. The goal
of LRMC is to recover X from X_Ω.

This problem is tantamount to identifying the r-dimensional subspace S* 
spanned by the columns in X, and this is how we will approach it. First observe 
that since X is rank-r, a column with fewer than r samples cannot be completed. 
A column with exactly r observations can be uniquely completed once S* is 
known, but it provides no information to identify S*. We will thus assume that: 


A1 Every column of X is observed on exactly r + 1 entries. 
The key insight of the paper is that observing r + 1 entries in a column
of X places one constraint on what S* may be. For example, if we observe 
r + 1 entries of a particular column, then not all r-dimensional subspaces will 
be consistent with the entries. If we observe more columns with r + 1 entries, 
then even fewer subspaces will be consistent with them. In effect, each column 
with r + 1 observations places one constraint that an r-dimensional subspace 
must satisfy in order to be consistent with the observations. The observed 
entries in different columns may or may not produce redundant constraints. 
As we will see, the pattern of observed entries determines whether or not the 
constraints are redundant, thus indicating the number of subspaces that satisfy 
them. The main result of this paper is a simple condition on the pattern of 
observed entries that guarantees that only a finite number of subspaces satisfies 
all the constraints. This in turn provides a simple condition for exact matrix 
completion. 

Remark 1. We point out that any observation, in addition to the r + 1 per 
column that we assume, cannot increase the number of rank-r matrices that 
agree with the observations. So in general, if some columns of X are observed 
on more than r + 1 entries, all we need is that the observed entries include 
a pattern with exactly r + 1 observations per column satisfying our sampling 
conditions. 

Also notice that completing X is the same as completing X^T, so a row with
fewer than r observations cannot be completed. While we do not assume that
each row is observed on at least r entries, our sampling conditions guarantee
that this is the case.

Let Gr(r, R^d) denote the Grassmannian manifold of r-dimensional subspaces
in R^d. Observe that each d × N rank-r matrix X can be uniquely represented in
terms of a subspace S* ∈ Gr(r, R^d) (spanning the columns of X) and an r × N
coefficient matrix Θ*. See Figure 1 to build some intuition. Let ν_G denote the
uniform measure on Gr(r, R^d), and ν_Θ the Lebesgue measure on R^{r×N}. Our
statements hold for almost every (a.e.) X with respect to the product measure
ν_G × ν_Θ.

The paper's main result is the following theorem, which gives a deterministic
sampling condition to guarantee that at most a finite number of r-dimensional
subspaces are consistent with X_Ω.

Given a matrix, let n(·) denote its number of columns and m(·) the number
of its nonzero rows.


Theorem 1. Let Ω be given, and suppose A1 holds. For almost every X,
there exist at most finitely many rank-r completions of X_Ω if and only if
there exists a matrix Ω̃ formed with r(d - r) columns of Ω, such that

(i) Every matrix Ω' formed with a subset of the columns in Ω̃ satisfies

\[ m(\Omega') \ge n(\Omega')/r + r. \tag{1} \]
Figure 1: Each column in a rank-r matrix X corresponds to a point in an r-dimensional
subspace S*. In these figures, S* is a 2-dimensional subspace (plane) in general position. On
the left, the columns of X are in general position inside S*, that is, drawn independently
according to an absolutely continuous distribution with respect to the Lebesgue measure on
S*, for example, according to a Gaussian distribution on S*. In this case, the probability of
observing a sample as on the right, where all columns lie in a line inside S*, is zero. Our
results hold for every rank-r matrix, except for a set of measure zero of pathological cases as
on the right.


The proof of Theorem 1 is given in Section 4. In words, condition (i) asks
that every subset of n columns of Ω̃ has at least n/r + r nonzero rows.
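Condition (i) can be checked by brute force on very small patterns. The sketch below is only an illustration under stated assumptions, not a method from the paper: each column is encoded as the set of rows where it is nonzero (a hypothetical encoding), every nonempty subset of columns is enumerated, and the inequality m ≥ n/r + r is tested in the equivalent integer form r·m ≥ n + r². The enumeration is exponential in the number of columns, so it is practical only for toy examples.

```python
from itertools import combinations

def satisfies_condition_i(columns, r):
    """Brute-force test of condition (i): every nonempty subset of n columns
    must have at least n/r + r nonzero rows.

    `columns` is a list of sets; each set holds the row indices where one
    column of the pattern is nonzero (an illustrative encoding).
    """
    for n in range(1, len(columns) + 1):
        for subset in combinations(columns, n):
            nonzero_rows = set().union(*subset)
            # m >= n/r + r, multiplied by r to stay in integer arithmetic
            if r * len(nonzero_rows) < n + r * r:
                return False
    return True

# d = 3, r = 1: Theorem 1 asks for r(d - r) = 2 columns with r + 1 = 2 entries each.
good_pattern = [{0, 1}, {1, 2}]
bad_pattern = [{0, 1}, {0, 1}]   # both columns touch the same two rows
print(satisfies_condition_i(good_pattern, r=1))
print(satisfies_condition_i(bad_pattern, r=1))
```

In the second pattern the two columns together cover only two rows, fewer than the three that (i) requires for n = 2 and r = 1, so the check fails.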


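Example 1. Suppose Ω is given by:

\[
\Omega =
\underbrace{\begin{bmatrix}
\mathbf{1} & \mathbf{1} & \cdots & \mathbf{1} \\
\mathbf{I} & \mathbf{I} & \cdots & \mathbf{I}
\end{bmatrix}}_{(r+1)(d-r)\ \text{columns}}
\quad
\begin{array}{l}
\}\ r\ \text{rows} \\
\}\ d-r\ \text{rows}
\end{array}
\]

where, consistent with the block sizes, 1 denotes an r × (d - r) block of ones and
I the (d - r) × (d - r) identity, so that Ω has exactly r + 1 nonzero entries per
column. This way, each column of Ω encodes exactly one constraint that candidate
subspaces must satisfy in order to be consistent with the observed data. In this
case we can simply take Ω̃ to be the matrix formed with the first r(d - r) columns
of Ω. One may verify that Ω̃ satisfies (i). Hence Ω satisfies the conditions of
Theorem 1.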

Unique Completability 

Theorem 1 is easily extended to a condition on Ω that is sufficient to guarantee
that one and only one subspace is consistent with X_Ω, which in turn suffices for
exact matrix completion.


Theorem 2. Let Ω be given, and suppose A1 holds. Then almost every X
can be uniquely recovered from X_Ω if Ω contains two disjoint submatrices:
Ω̃ of size d × r(d - r) and Ω̂ of size d × (d - r), such that Ω̃ satisfies (i) and

(ii) Every matrix Ω' formed with a subset of the columns in Ω̂ satisfies

\[ m(\Omega') \ge n(\Omega') + r. \tag{2} \]
The proof of Theorem 2 is given in Section 5. In words, condition (ii) asks
that every subset of n columns of Ω̂ has at least n + r nonzero rows. Notice that
(1) is a weaker condition than (2), but (1) is required to hold for all the subsets
of r(d - r) columns, while (2) is required to hold only for all the subsets of d - r
columns.

Example 2. Consider Ω as in Example 1. Take Ω̃ to be the matrix formed
with the first r(d - r) columns of Ω and Ω̂ to be the matrix formed with the last
d - r columns of Ω. One may verify that Ω̃ satisfies (i) and that Ω̂ satisfies (ii).
Hence Ω satisfies the conditions of Theorem 2.
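The verification in Example 2 can be carried out mechanically for small d and r. The following sketch assumes that the pattern of Example 1 consists of r + 1 identical blocks, each stacking an all-ones r × (d - r) block on top of a (d - r) × (d - r) identity; the function names and the set-based encoding of columns are hypothetical. It checks (i) on the first r(d - r) columns and (ii) on the last d - r columns by enumerating all subsets.

```python
from itertools import combinations

def block_pattern(d, r):
    """Columns of the assumed Example 1 pattern, each as a set of nonzero rows:
    r + 1 identical blocks, ones in the first r rows above a (d - r) identity."""
    block = [set(range(r)) | {r + j} for j in range(d - r)]
    return [set(col) for _ in range(r + 1) for col in block]

def holds_for_all_subsets(columns, bound):
    """True if every nonempty subset with n columns and m nonzero rows
    satisfies bound(n, m)."""
    for n in range(1, len(columns) + 1):
        for subset in combinations(columns, n):
            m = len(set().union(*subset))
            if not bound(n, m):
                return False
    return True

d, r = 4, 2
omega = block_pattern(d, r)
omega_tilde = omega[: r * (d - r)]    # first r(d - r) columns
omega_hat = omega[r * (d - r):]       # last d - r columns

# (i): m >= n/r + r, written as r*m >= n + r^2; (ii): m >= n + r
condition_i = holds_for_all_subsets(omega_tilde, lambda n, m: r * m >= n + r * r)
condition_ii = holds_for_all_subsets(omega_hat, lambda n, m: m >= n + r)
print(condition_i, condition_ii)
```

For d = 4 and r = 2 both checks pass, in agreement with the example; larger sizes quickly become infeasible for this exhaustive approach.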

Theorem 1 implies that r(d - r) columns with r + 1 entries are necessary for
finite completability (hence also for unique completability). There are cases
when r(d - r) columns are also sufficient for unique completability, e.g., if r = 1,
where finite completability is equivalent to unique completability (see
Proposition 1).

In general, though, unique completability requires more columns than finite
completability (see Example 5). Theorem 2 gives deterministic sufficient sampling
conditions for unique completability that only require (r + 1)(d - r) columns.
This shows that with just a few more observations, unique completability follows
from finite completability.

We point out that when the conditions of Theorem 2 are met, S* can be
uniquely identified as

\[
S^\star = \operatorname{span}\begin{bmatrix} \mathbf{I} \\ \mathbf{V} \end{bmatrix},
\]

where V is the unique solution to the polynomial system ℱ(V) = 0, with ℱ as
defined in Section 4.

Once S* is known, X can be perfectly recovered observing only r entries per
column. To see this, let U* be a basis of S*, and let υ be a subset of {1, ..., d}
with exactly r elements. We will use the subscript υ to denote restriction to
the rows in υ. Since the coefficients of column x in the basis U* are given by
θ* = (U*_υ^T U*_υ)^{-1} U*_υ^T x_υ, we can recover the entire column as x = U* θ*.
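As a small worked instance of this recovery step, the sketch below completes one column from r observed entries given a basis of the subspace. It is an illustration with hypothetical names and a made-up basis; it uses exact rational arithmetic and assumes the r × r restriction of the basis to the observed rows is invertible, in which case the least-squares formula above reduces to solving that square system.

```python
from fractions import Fraction

def solve(square, rhs):
    """Solve square @ t = rhs by Gauss-Jordan elimination over Fractions.
    Assumes the matrix is invertible (next() fails otherwise)."""
    size = len(square)
    aug = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(square, rhs)]
    for col in range(size):
        pivot = next(i for i in range(col, size) if aug[i][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        lead = aug[col][col]
        aug[col] = [v / lead for v in aug[col]]
        for i in range(size):
            if i != col:
                factor = aug[i][col]
                aug[i] = [a - factor * b for a, b in zip(aug[i], aug[col])]
    return [row[-1] for row in aug]

def complete_column(basis, observed_rows, observed_values):
    """Recover a full column from r observed entries, given a d x r basis."""
    restricted = [basis[i] for i in observed_rows]   # r x r block of the basis
    theta = solve(restricted, observed_values)       # coefficients in the basis
    return [sum(u * t for u, t in zip(row, theta)) for row in basis]

# Hypothetical basis with d = 4, r = 2; the true column is 2*u1 + 3*u2.
U = [[1, 0], [0, 1], [1, 1], [1, 2]]
x = complete_column(U, observed_rows=[2, 3], observed_values=[5, 8])
print([int(v) for v in x])
```

Only the third and fourth entries are observed here, yet the first two entries are recovered because the subspace is known.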


About Conditions (i) and (ii) 

In general, verifying condition (i) in Theorems 1 and 2 may be computationally
prohibitive, especially for large d. On the other hand, one can easily and
efficiently verify whether (ii) is satisfied by checking the dimension of the null-space
of a sparse matrix (Algorithm 1). Fortunately, there is a tight relation between
conditions (i) and (ii), summarized in the following lemma. The proofs of the
statements in this section are given in Section 6.

Lemma 1. Let Ω̃ be a d × r(d - r) matrix formed with a subset of the columns in
Ω. Suppose Ω̃ can be partitioned into r matrices {Ω_τ}_{τ=1}^{r}, each of size d × (d - r),
such that (ii) holds for every Ω_τ. Then Ω̃ satisfies (i).
As a consequence of Lemma 1 we obtain an additional sufficient condition for
completability that only involves (ii).

Corollary 1. Let Ω be given, and suppose A1 holds. Then almost every X can
be uniquely recovered from X_Ω if Ω contains r + 1 disjoint matrices {Ω_τ}_{τ=1}^{r+1},
each of size d × (d - r), such that (ii) holds for every Ω_τ.

Example 3. Consider Ω as in Example 1. We can partition Ω into
[ Ω_1 | Ω_2 | ··· | Ω_{r+1} ], as depicted in Example 1. One may verify that Ω_τ
satisfies (ii) for every τ = 1, ..., r + 1. Hence Ω satisfies the conditions of
Corollary 1.

With Corollary 1 we show that completable patterns appear with high probability
under uniform random sampling schemes with as few as O(max{r, log d})
samples per column.

Theorem 3. Let 0 < ε ≤ 1 be given. Suppose r ≤ d/6 and that each column of X is
observed in at least ℓ entries, distributed uniformly at random and independently
across columns, with

\[ \ell \ge \max\{\, 12(\log(d/\epsilon) + 1),\ 2r \,\}. \tag{3} \]

Then with probability at least 1 - ε, X_Ω will be finitely rank-r completable (if
N ≥ r(d - r)) and uniquely completable (if N ≥ (r + 1)(d - r)).
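To get a feel for the size of this threshold, the sketch below evaluates the bound of Theorem 3, read here as ℓ ≥ max{12(log(d/ε) + 1), 2r}. The function name is hypothetical, and the use of the natural logarithm is an assumption, since the base is not stated in this section.

```python
import math

def samples_per_column(d, r, eps):
    """Smallest integer l with l >= max{12 (log(d / eps) + 1), 2 r}.
    Assumes log denotes the natural logarithm."""
    return math.ceil(max(12 * (math.log(d / eps) + 1), 2 * r))

# Setting used later in the experiments: d = 500, r = 10, eps = 1/d.
d, r = 500, 10
print(samples_per_column(d, r, eps=1 / d))   # 162 under the natural-log assumption
```

For these values the logarithmic term dominates 2r = 20, and the threshold corresponds to observing roughly a third of each column.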

In many situations, though, sampling is not uniform. For instance, in vision,
occlusion of objects can produce missing data in very non-uniform random
patterns. In cases like this, we can partition Ω (e.g., randomly) into matrices
{Ω_τ}, each with d - r columns. We can use Algorithm 1 below to determine
whether each Ω_τ satisfies (ii). If this is the case, Ω is completable by Corollary
1. More about this is discussed in Section 3.

To present the algorithm, let us introduce the matrix A that will allow us
to determine efficiently whether a sampling Ω̂ satisfies (ii). Let Ω̆ be a matrix
formed with N ≥ d - r columns of Ω, and let ω_i index the nonzero entries in the
i-th column of Ω̆. Let U be a d × r matrix drawn according to ν_U, an absolutely
continuous distribution with respect to the Lebesgue measure on R^{d×r}, and let
U_{ω_i} denote the restriction of U to the nonzero rows in ω_i. Let a_{ω_i} ∈ R^{r+1} be a
nonzero vector in ker U_{ω_i}^T, and a_i be the vector in R^d with the entries of a_{ω_i} in
the nonzero locations of ω_i and zeros elsewhere. Finally, let A denote the d × N
matrix with {a_i}_{i=1}^{N} as columns.

Algorithm 1 will verify whether dim ker A^T = r, and this will determine
whether Ω̆ contains a d × (d - r) matrix Ω̂ satisfying (ii). The key insight behind
Algorithm 1 is that A encodes the information of the projections of S = span{U}
onto the canonical coordinates indicated by Ω̆. Theorem 1 in [21] shows that
these projections will uniquely determine S if and only if dim ker A^T = r, which
will be the case if and only if Ω̆ contains a d × (d - r) matrix Ω̂ satisfying
(ii). We thus have the following corollary, which states that with probability 1,
Algorithm 1 will determine whether Ω̆ contains a matrix Ω̂ satisfying (ii).





Algorithm 1: Determine whether Ω̆ contains a matrix Ω̂ satisfying (ii).

Input: Matrix Ω̆ with N ≥ d - r columns of Ω.

- Draw U ∈ R^{d×r} according to ν_U.
- for i = 1 to N do
    - ω_i = indices of the nonzero rows of the i-th column of Ω̆.
    - a_{ω_i} = nonzero vector in ker U_{ω_i}^T.
    - a_i = vector in R^d with the entries of a_{ω_i} in the
      nonzero locations of ω_i and zeros elsewhere.
- A = matrix formed with {a_i}_{i=1}^{N} as columns.
- if dim ker A^T = r then
    - Output: Ω̆ contains a d × (d - r) matrix Ω̂ satisfying (ii).
- else
    - Output: Ω̆ contains no d × (d - r) matrix Ω̂ satisfying (ii).


Corollary 2. Let Ω̆ be a matrix formed with N ≥ d - r columns of Ω. Construct
A as in Algorithm 1. Then ν_U-almost surely, Ω̆ contains a d × (d - r) matrix Ω̂
satisfying (ii) if and only if dim ker A^T = r.
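The steps of Algorithm 1 can be sketched in a few lines. The version below is an illustration under stated assumptions rather than the authors' implementation: columns are given as lists of r + 1 nonzero row indices (so A1 is assumed), and U is drawn with random integer entries so that ranks can be computed exactly with rational arithmetic. Integer entries are not an absolutely continuous distribution; they are used here as a practical stand-in that is generic with high probability. All function names are hypothetical.

```python
import random
from fractions import Fraction

def rref(matrix):
    """Row-reduce a matrix of Fractions; return (reduced rows, pivot columns)."""
    rows = [row[:] for row in matrix]
    pivots, pivot_row = [], 0
    n_cols = len(rows[0]) if rows else 0
    for col in range(n_cols):
        source = next((i for i in range(pivot_row, len(rows)) if rows[i][col] != 0), None)
        if source is None:
            continue                      # no pivot in this column
        rows[pivot_row], rows[source] = rows[source], rows[pivot_row]
        lead = rows[pivot_row][col]
        rows[pivot_row] = [v / lead for v in rows[pivot_row]]
        for i in range(len(rows)):
            if i != pivot_row and rows[i][col] != 0:
                factor = rows[i][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[pivot_row])]
        pivots.append(col)
        pivot_row += 1
        if pivot_row == len(rows):
            break
    return rows, pivots

def null_vector(matrix):
    """One nonzero x with matrix @ x = 0 (assumes more columns than rank)."""
    reduced, pivots = rref(matrix)
    n_cols = len(matrix[0])
    free = next(c for c in range(n_cols) if c not in pivots)
    x = [Fraction(0)] * n_cols
    x[free] = Fraction(1)
    for row_index, col in enumerate(pivots):
        x[col] = -reduced[row_index][free]
    return x

def contains_block_satisfying_ii(columns, d, r, seed=0):
    """Sketch of Algorithm 1: test whether dim ker A^T = r."""
    rng = random.Random(seed)
    U = [[Fraction(rng.randint(-10**6, 10**6)) for _ in range(r)] for _ in range(d)]
    A_transposed = []                     # row i holds a_i, so this is A^T (N x d)
    for omega in columns:
        U_omega_T = [[U[row][k] for row in omega] for k in range(r)]  # r x (r + 1)
        a_omega = null_vector(U_omega_T)
        a = [Fraction(0)] * d
        for row, value in zip(omega, a_omega):
            a[row] = value
        A_transposed.append(a)
    _, pivots = rref(A_transposed)
    return d - len(pivots) == r           # dim ker A^T = d - rank(A)

print(contains_block_satisfying_ii([[0, 1, 2], [0, 1, 3]], d=4, r=2))
print(contains_block_satisfying_ii([[0, 1, 2], [0, 1, 2]], d=4, r=2))
```

The first pattern has d - r = 2 columns that together touch all four rows, so the test succeeds; the second repeats the same column, the two vectors a_i coincide, and the test fails.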

Algorithm 1 can also be used to design completable samplings. As will be
discussed in Section 3, this can be particularly useful for adaptive settings, where
one may choose which entries to observe, yet it is undesirable or impossible to
observe full columns or full rows.

Example 4. One may use Algorithm 1 to verify that each of the r + 1 blocks
in the sampling matrix Ω below satisfies (ii). This implies that Ω satisfies the
conditions of Corollary 1, and can thus be uniquely completed. In contrast with
Example 1, this pattern samples neither full columns nor full rows.



3 Experiments and Implications 

In this section we discuss implications of the results stated above and explore
how well they predict performance in a series of simulation experiments. In all
our experiments we use the so-called iterative hard-thresholded SVD (IHTSVD)
algorithm [12]. This algorithm alternates between truncating the SVD of the
current estimate to a user-specified rank r, and replacing the values in the
observed entries with their original (observed) values. It is
quite similar to the Singular Value Thresholding (SVT) algorithm [13], OptSpace [14]
and FPCA [15]. In the very low sampling regimes of interest in our studies,
we found that the IHTSVD algorithm typically performed as well as or better than
several other completion algorithms (e.g., SVT [13], GROUSE [16], alternating
minimization [17] and EM [18]).
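The two alternating steps described above can be sketched with NumPy as follows. This is a minimal illustration, not the implementation of [12]: the zero initialization of missing entries, the fixed iteration count and the function name are assumptions made here.

```python
import numpy as np

def ihtsvd(observed, mask, rank, iterations=500):
    """Iterative hard-thresholded SVD sketch.

    observed: array holding the known values (entries outside mask are ignored).
    mask: boolean array, True where an entry is observed.
    """
    estimate = np.where(mask, observed, 0.0)   # zero-fill missing entries (assumption)
    for _ in range(iterations):
        U, s, Vt = np.linalg.svd(estimate, full_matrices=False)
        low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank, :]   # truncate to the given rank
        estimate = np.where(mask, observed, low_rank)        # restore observed entries
    return estimate

# Small synthetic instance: rank-2 matrix with about 70% of entries observed.
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 2)) @ rng.standard_normal((2, 30))
mask = rng.random(X.shape) < 0.7
X_hat = ihtsvd(X, mask, rank=2)
print(np.linalg.norm(X_hat - X) / np.linalg.norm(X))   # normalized Frobenius error
```

The returned estimate always agrees with the data on the observed entries; whether it also matches the missing ones depends on the sampling pattern and on the number of iterations.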

Lower bound 

It is easy to see that ℓ = O(max{r, log d}) uniformly random entries
per column are necessary to complete an O(d) × O(d) matrix. This is because
a column with fewer than r observed entries cannot be completed, and if fewer
than O(log d) uniformly random samples per column are observed, then a row
may be completely unobserved with large probability, making it impossible to
complete the matrix. Thus, ℓ = O(max{r, log d}) is a lower bound for LRMC.
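The row-coverage part of this argument can be quantified with a short calculation. If each of N columns is observed on ℓ of its d entries, chosen uniformly at random and independently across columns, a given row is missed by one column with probability 1 - ℓ/d, so the expected number of completely unobserved rows is d(1 - ℓ/d)^N. The sketch below evaluates this quantity; the function name and the example sizes are chosen here for illustration only.

```python
def expected_unobserved_rows(d, N, samples):
    """Expected number of rows with no observed entry when each of N columns
    is observed on `samples` of its d entries, uniformly and independently."""
    return d * (1 - samples / d) ** N

d = N = 1000
for samples in (2, 10):
    # natural log of 1000 is about 6.9, so 2 is below and 10 is above log d
    print(samples, expected_unobserved_rows(d, N, samples))
```

With two samples per column, more than a hundred rows are expected to be empty, whereas with ten samples per column the expected number of empty rows is well below one.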

It was further shown [4] that there exist matrices that cannot be completed
unless ℓ = O(μ r log d) uniformly random entries per column are observed,
where μ ∈ [1, d/r] is the standard coherence parameter defined as

\[ \mu := \frac{d}{r} \max_{1 \le j \le d} \| P_{S^\star} e_j \|_2^2, \]

where P_{S*} denotes the projection operator onto S*, and e_j the j-th canonical
vector in R^d. Our results imply that this is only the case for a set of matrices
with measure zero, and that a.e. matrix can be uniquely completed with as
few as ℓ = O(max{r, log d}) uniformly random samples per column,
regardless of μ.
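As a quick numerical illustration of the coherence parameter, the sketch below computes μ = (d/r) max_j ||P e_j||² for a subspace given by a list of basis vectors. It orthonormalizes the basis with Gram-Schmidt and uses the fact that, for an orthonormal basis Q, ||P e_j||² is the squared norm of the j-th row of Q. The function names and the two example subspaces are assumptions made here.

```python
import math

def orthonormalize(basis_columns):
    """Gram-Schmidt on a list of vectors (assumed linearly independent)."""
    ortho = []
    for v in basis_columns:
        w = list(v)
        for q in ortho:
            dot = sum(a * b for a, b in zip(w, q))
            w = [a - dot * b for a, b in zip(w, q)]
        norm = math.sqrt(sum(a * a for a in w))
        ortho.append([a / norm for a in w])
    return ortho

def coherence(basis_columns):
    """mu = (d / r) * max_j ||P e_j||^2 for the span of the given vectors."""
    d, r = len(basis_columns[0]), len(basis_columns)
    Q = orthonormalize(basis_columns)
    projections = [sum(q[j] ** 2 for q in Q) for j in range(d)]
    return d / r * max(projections)

print(coherence([[1, 1, 1, 1]]))                  # energy spread over all coordinates
print(coherence([[1, 0, 0, 0], [0, 1, 0, 0]]))    # subspace aligned with canonical axes
```

The first subspace attains the minimum value μ = 1, and the second, spanned by canonical vectors, attains the maximum μ = d/r = 2.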

To better understand this, and see that our results do not contradict previous
theory, let us revisit the proof of Theorem 1.7 in [4]. The proof is based on the
construction of block-diagonal matrices with blocks of size d/(rμ) and coherence
≤ μ that cannot be recovered with fewer than ℓ = O(μ r log d) uniformly random
samples per column, e.g.,

[Equation (4): a block-diagonal matrix whose diagonal blocks have size d/(rμ) and whose remaining entries are zero.]


This is so because zero-valued entries provide no information for the reconstruction
process. It follows that the larger μ, the smaller the blocks will be, and
more intensive random sampling would be required to guarantee that entries in


[Figure 2 plot: vertical axis ℓ, with ticks r + 1, O(max{r, log d}) and O(r log d); horizontal axis N, with ticks r, d, r(d - r) and (r + 1)(d - r).]

Figure 2: Theoretical sampling regimes of LRMC. In the white region, where the dashed
line is given by ℓ = r(d - r)/N + r, it is easy to see that LRMC is impossible by a simple count
of the degrees of freedom in a subspace (see Section 4). In the light-gray region, LRMC is
possible provided the entries are observed in the right places, e.g., satisfying the conditions of
Theorem 2. By Theorem 3, uniform random samplings will satisfy these conditions with high
probability as long as N ≥ (r + 1)(d - r) and ℓ ≥ max{12(log(d/ε) + 1), 2r}, hence with high
probability LRMC is possible in the dark-gray region. Previous analyses showed that LRMC
is possible from uniform random sampling in the striped region [3], but the rest remained
unclear until now.



the diagonal blocks are observed. This is why more samples (O(r μ log d) per
column) are required to reconstruct more coherent matrices like this one, and
hence the dependency on rμ in the bound of Theorem 1.7 in [4].

However, matrices with this block structure have measure zero (with respect
to the measure defined above). Our results show that for a.e. matrix, an incomplete
column contains the same exploitable information regardless of the
coherence parameter, and O(max{r, log d}) uniform random entries per column
are sufficient for completion. This means that while there are some matrices
that require O(r μ log d) uniform random samples per column for reconstruction,
a.e. matrix only requires O(max{r, log d}), regardless of μ.

Sample Complexity 

Coherence aside, it is also known that N = d columns, and ℓ = O(r log d) uniform
random samples per column are sufficient for completion [3]. Theorem 3 extends
this result, showing that N = (r + 1)(d - r) columns and ℓ = O(max{r, log d})
uniform random samples per column are sufficient to uniquely complete a.e.
matrix. This exposes an interesting tradeoff between the required number of
columns and observed entries per column for completion, defining new unstudied
sampling regimes where completion is now known to be possible (Figure 2).
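The tradeoff between the number of columns and the number of entries per column can be illustrated with the degrees-of-freedom count behind the impossible region of Figure 2. The sketch below assumes that an r-dimensional subspace of R^d has r(d - r) degrees of freedom and that a column observed on ℓ ≥ r entries contributes at most ℓ - r constraints (which matches the one constraint per column with r + 1 entries used in Section 2), so that completion requires N(ℓ - r) ≥ r(d - r). The function name is hypothetical.

```python
def counting_lower_bound(d, r, N):
    """Smallest integer l with N (l - r) >= r (d - r),
    i.e. l >= r (d - r) / N + r (integer ceiling, no floating point)."""
    return r + -(-(r * (d - r)) // N)

d, r = 100, 5
for N in (d, r * (d - r), (r + 1) * (d - r)):
    print(N, counting_lower_bound(d, r, N))
```

With N = r(d - r) columns the count gives ℓ ≥ r + 1, in line with assumption A1 and Theorem 1; with only N = d columns, roughly twice as many entries per column are needed.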

The purpose of our first experiment is to support that ℓ = O(max{r, log d})
random samples per column are truly sufficient for LRMC, as opposed to O(r log d).
To this end, we will study the behavior of the IHTSVD algorithm as a function
of the ambient dimension d and the rank r (see the beginning of Section 3 for a
discussion of this algorithmic choice).

To obtain low-rank matrices, we first generated a d × r random matrix U*
with N(0,1) i.i.d. entries to use as basis of S*. We then generated an
r × (r + 1)(d - r) random matrix Θ*, also with N(0,1) i.i.d. entries, to use as
coefficient vectors, to construct X = U* Θ*. Matrices generated this way are known to
have low coherence.
Figure 3: Results of the IHTSVD algorithm as a function of the ambient dimension d and the
number of uniform random samples per column ℓ, for rank r = 7. In each of the 2,000 trials
we declared a success if the normalized completion error was below 10^{-12} (using normalized
Frobenius norm). The black line represents the linear discriminant between success and failure
trials.


Next, for different values of the rank r, we tested whether a matrix could be
completed as a function of its ambient dimension d and the number of uniform
random samples per column ℓ. For example, the results of this experiment for
r = 7 can be seen in Figure 3.

We then computed the linear discriminant between successful and failed
trials for each value of r. If ℓ = O(r log d) samples were necessary, we would
expect the slope of these lines to grow proportionally to r. However, the
results, depicted in Figure 4, show that the slope of these lines remains fairly
constant, and the offset grows with r, supporting that ℓ = O(max{r, log d})
samples are sufficient.

Computational Complexity 

Our results show that completion is theoretically possible with as few as
ℓ = O(max{r, log d}) uniform random samples per column, or even with as few
as ℓ = r + 1 (provided they are located in the right places). Nevertheless, this may
involve solving the system of polynomial equations ℱ = 0 (see Section 4), which
is computationally impractical. It is thus currently unknown whether there exist
practical completion algorithms for these uncharted sampling regimes.

We now present a series of experiments that suggest three things. First,
even in cases where LRMC is theoretically possible, missingness seems to come
at a price: the more missing data, the more computationally expensive completion
seems to be. This further suggests that there is a minimal sampling
regime where, though theoretically possible, LRMC might be computationally
prohibitive in practice. Second, even though theoretically, whether a.e. matrix
can be completed does not depend on its coherence, in practice, extremely
coherent matrices may be computationally more expensive to complete. Similarly,
this suggests that there is a maximal coherence regime where, though
theoretically possible, LRMC might be computationally prohibitive in practice.
And third, there seems to be an additional uncharted sampling regime with
ℓ < O(μ r log d) samples per column where completion is computationally feasible.

[Figure 4 plot: horizontal axis log d.]

Figure 4: Linear discriminants for different values of the rank r, between successful (above
line) and unsuccessful (below line) completions for the experiment in Figure 3. That is, for
a given r, any pair (log d, ℓ) above the linear discriminant typically succeeds at completion,
and below the linear discriminant typically fails. Theorem 3 shows that ℓ = O(max{r, log d})
uniform random observations per column are sufficient for completion. The slope of these
lines remains fairly constant, and the offset grows with r, supporting this result.

To summarize, we have the following sampling regimes, where "?" marks a
transition point that is not yet characterized:

    Samples per column ℓ         LRMC is
    ℓ < r + 1                    impossible
    r + 1 ≤ ℓ < ?                possible, but apparently computationally prohibitive
    ? ≤ ℓ < O(μ r log d)         possible, and apparently computationally feasible
    ℓ ≥ O(μ r log d)             well-studied

We first study the computational cost of missing data. To this end we
computed the minimum number of iterations required to complete a matrix, as
a function of the number of uniform random samples per column ℓ. The results
are summarized in Figure 5. Unsurprisingly, the more missing data, the more
iterations are required to complete the matrix.

In addition, we constructed samplings Ω with only ℓ = r + 1 samples per
column selected uniformly at random, and kept only those samplings satisfying
the conditions of Corollary 1, to guarantee that X_Ω was uniquely completable
(we used Algorithm 1 to determine whether each sampling satisfied these
conditions). Unfortunately, even though X_Ω was uniquely completable, the matrix
was incorrectly completed in every single trial. This suggests that completion in
this regime, now known to be theoretically possible (through the solution of the
polynomial system ℱ = 0; see Section 4), might be computationally prohibitive
in practice.

We thus tested how much missing data practical algorithms can handle while
remaining computationally efficient. To this end, we sampled ℓ < μ r log d entries
per column, drawn uniformly at random, and ran the IHTSVD algorithm for
at most T = d iterations (see the beginning of Section 3 for a discussion of this
algorithmic choice).

Figure 5: Average number of iterations (over 500 trials) required by IHTSVD to complete
a matrix with low coherence (μ < 3) with an accuracy of 10^{-12} (using normalized Frobenius
norm), as a function of p := ℓ/d, the proportion of uniform random samples per column, with
ambient dimension d = 500 and rank r = 10.

Figure 6: Average completion error of IHTSVD (over 500 trials) for different levels of additive
i.i.d. zero-mean Gaussian noise (noise variance as indicated in the legend), after at most 250
iterations, as a function of p := ℓ/d, the proportion of uniform random samples per column, with
ambient dimension d = 500 and rank r = 10. Previous guarantees would require all entries to
be observed, and so in practice, existing theory would not allow one to confirm the correctness
of a completion. Our results do. The dashed line represents p = max{12(log(d/ε) + 1), 2r}/d,
with ε = 1/d, the sufficient condition of Theorem 3, which implies that with probability at least
1 - ε, for any p above this threshold, a rank-r completion is guaranteed to be correct, regardless
of the completion method.

To truly test this regime, we considered a setup where previous theory would
require all entries to be observed to guarantee a correct completion with probability
at least 1 - ε. There are plenty of such scenarios. We arbitrarily selected
d = 500 and r = 10, and ε = 1/d.

Our simulations, summarized in Figure 6, show that practical algorithms
tend to work consistently well with ℓ < μ r log(d/ε). This suggests that there is
a regime with ℓ < μ r log(d/ε) samples per column where completion is
computationally feasible, even in the presence of noise.
Figure 7: Results of the IHTSVD algorithm as a function of the coherence parameter μ ∈
[1, d/r] and the proportion of uniform random samples per column p := ℓ/d, with ambient
dimension d = 100 and rank r = 5. Similar results were observed for other algorithms, including
alternating minimization [17] and EM [18]. In each of the 5,000 trials we declared a success
if the normalized completion error was below 10^{-12} (using normalized Frobenius norm). Our
theoretical results show that whether a.e. matrix can be uniquely completed does not depend
on its coherence. This experiment suggests that in practice, this is also the case for most of
the range of μ. For instance, given p, the success rate of this algorithm is about the same for
most of the range of μ (about 1 < μ < 17). Nevertheless, the success rate quickly decays if the
coherence is extremely high (μ close to the maximum possible, d/r = 20).


Dependence on Coherence Parameter 

In our next experiment, we study the practical role of coherence in LRMC.
More precisely, we tested whether a matrix could be computationally efficiently
completed as a function of its coherence parameter μ, and the number of uniform
random samples per column ℓ (to generate matrices with a specific coherence
parameter, we simply increased the magnitude of a few entries in U*, until it had
the desired coherence). The results, summarized in Figure 7, suggest that for
most of the coherence range, whether this algorithm can correctly complete the
matrix mainly depends on the number of samples rather than on the coherence
parameter. For instance, see in Figure 7 that given the number of samples,
the success rate of this algorithm is about the same for most of the range
of μ. Nonetheless, there are some cases with extremely large coherence (μ
close to the maximum possible, d/r, corresponding to subspaces almost perfectly
aligned with the canonical axes), where this algorithm tends to fail more often
at reconstructing the matrix (these are cases where most of the information is
concentrated in only a few entries, which brings computational and numerical
accuracy problems).

To further study the role of coherence in practice, we recorded the number
of iterations that were required to complete each matrix (in the success cases
of the previous experiment). The results, summarized in Figure 8, suggest
that while coherent matrices may be theoretically as completable as incoherent
ones, in practice, the more coherent a matrix is, the more computationally
expensive it may be to complete it. Furthermore, the number of iterations
seems to increase steadily for most of the coherence range, but after a transition
point it suddenly seems to grow exponentially, suggesting the existence of a
maximal coherence regime where, though theoretically possible, completion may
be computationally impractical (similar to the minimal sampling regime from
our previous experiment).

Figure 8: Average number of iterations (of the success trials from Figure 7) required by
IHTSVD to complete a matrix with an accuracy of 10^{-12} (using normalized Frobenius norm),
as a function of its coherence parameter μ. This suggests the existence of a maximal coherence
regime (e.g., after the dashed line) where, though theoretically possible, completion may
become computationally impractical.

New Guarantees 

It is known that O(μ r log d) uniform random samples per column (with constants
greater than 1) are sufficient for completion [3]. There are non-pathological
regimes (e.g., d = 500 and r = 10, or d = 100 and r = 5, and ideal coherence, as 
in our experiments) where these conditions end up requiring that all entries are 
observed. Experiments show that the IHTSVD algorithm can exactly complete 
such matrices when even fewer than half of the entries are observed, but prior 
theory gives no guarantees in these regimes, and so in practice, one would be 
unable to confirm the correctness of a completion. 

Furthermore, typical conditions for LRMC usually apply to matrices with
bounded coherence, and require uniform random sampling with rates that depend
on the coherence parameter $\mu$. In many practical applications, sampling is
hardly uniform (e.g., vision, where occlusion of objects produces missing data in
very non-uniform random patterns), and $\mu$ is typically unknown, so the existing
theory does not allow one to confirm the correctness of a completion.

Our results shed new light on these issues. Theorem 2 states that regardless
of coherence and the sampling model, if the observation pattern satisfies
the conditions of the theorem, a rank-$r$ completion, obtained by any method
whatsoever, is guaranteed to be the correct completion. In particular, Theorem
3 states that this will be the case with high probability under uniform random
sampling models.

Figure 9: We generated $d \times N$ matrices $\Omega_\tau$ with only $r+1$ samples per column, selected uniformly
at random, with $d = 100$, $r = 5$. This figure shows the proportion of times (over 500 trials)
that $\Omega_\tau$ contains a $d \times (d-r)$ matrix $\hat{\Omega}_\tau$ satisfying (ii), as a function of $N$. We used Algorithm
1 to determine whether this was the case. Notice that as $N$ grows, the probability that each
$\Omega_\tau$ contains an $\hat{\Omega}_\tau$ satisfying (ii) quickly approaches 1.

In some cases one can use Corollary 1 together with Algorithm 1 to verify 
efficiently and deterministically whether these conditions are satisfied. Recall 
that Corollary 1 states that unique completability is possible if $\Omega$ contains $r + 1$
disjoint matrices $\{\hat{\Omega}_\tau\}_{\tau=1}^{r+1}$, each of size $d \times (d-r)$ satisfying (ii). Given a matrix
$\hat{\Omega}_\tau$, Algorithm 1 allows one to verify whether it satisfies (ii). However, it provides
no means to select the $\hat{\Omega}_\tau$'s.

In general, one can construct samplings for which finding the right $\hat{\Omega}_\tau$'s
would require exponential time. However, if the samples are well spread across
the rows, then one may validate a completion deterministically by selecting the
$\hat{\Omega}_\tau$'s randomly.

To see this, suppose $\Omega$ has $(r+1)N$ columns, with $N \ge d-r$ and exactly $r + 1$
observations per column (see Remark 1). We can randomly partition $\Omega$ into
$r+1$ disjoint submatrices $\{\Omega_\tau\}_{\tau=1}^{r+1}$, each with $N$ columns. One can then use Algorithm
1 to verify whether each $\Omega_\tau$ contains a $d \times (d-r)$ submatrix $\hat{\Omega}_\tau$ satisfying (ii).
If this is the case, then we know deterministically that the completion is correct.

Figure 9 shows that as $N$ grows, the probability that each $\Omega_\tau$ contains an
$\hat{\Omega}_\tau$ satisfying (ii) quickly approaches 1. For example, with $N$ as small as $2(d-r)$,
i.e., with only twice as many columns as strictly necessary, each $\Omega_\tau$ will contain
an $\hat{\Omega}_\tau$ satisfying (ii) with probability larger than 0.999. This suggests that if we
find a low-rank completion of a matrix, we can expect that a random partition
will certify it through Corollary 1 and Algorithm 1.
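The random-partition procedure described above can be sketched in a few lines. This is only an illustration under stated assumptions: Algorithm 1 is not reproduced here, so condition (ii) is checked by brute force over all column subsets (exponential time, usable only for small $d$), and the helper names `satisfies_ii`, `contains_certificate` and `certify_by_random_partition` are hypothetical.

```python
from itertools import combinations
import random


def satisfies_ii(columns, r):
    """Brute-force check of (ii): every subset of n columns touches >= n + r rows.

    `columns` is a sequence of sets of observed row indices, one set per column.
    This is a stand-in for Algorithm 1, not a reimplementation of it.
    """
    for n in range(1, len(columns) + 1):
        for subset in combinations(columns, n):
            if len(set().union(*subset)) < n + r:
                return False
    return True


def contains_certificate(columns, d, r):
    """True if some choice of d - r of the given columns satisfies (ii)."""
    return any(satisfies_ii(choice, r) for choice in combinations(columns, d - r))


def certify_by_random_partition(columns, d, r, rng):
    """Shuffle the columns, split them into r + 1 equal parts, and test each part."""
    shuffled = list(columns)
    rng.shuffle(shuffled)
    n_per_part = len(shuffled) // (r + 1)
    parts = [shuffled[t * n_per_part:(t + 1) * n_per_part] for t in range(r + 1)]
    return all(contains_certificate(part, d, r) for part in parts)


# Small illustrative instance: (r + 1) * N columns with r + 1 samples each.
rng = random.Random(0)
d, r, N = 6, 2, 8
pattern = [set(rng.sample(range(d), r + 1)) for _ in range((r + 1) * N)]
print(certify_by_random_partition(pattern, d, r, rng))
```

If the function returns `True`, each part contains a $d \times (d-r)$ block satisfying (ii), which is the situation in which Corollary 1 applies. A `False` result only means that this particular partition failed; another shuffle may succeed.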

This way, our results can be used to certify the correctness of a completion,
thus bringing guarantees applicable to any algorithm, under any sampling
model, in lieu of coherence assumptions.

Adaptive Sampling 

If one could select which entries of X to observe, perhaps the easiest way to
recover X is to sample r linearly independent columns to obtain a basis of
the subspace, and then r rows to obtain the coefficients of each column in
this basis. However, in many LRMC applications, the entries one may observe
can be limited. Take for example recommender systems, where obtaining a
complete column equates to asking a single user (column) to evaluate every
item (row). In these problems the number of rows can be very large, hence this
can be an unreasonable thing to ask. Moreover, the combinations of rows that
one may sample could be restricted. Another example arises in distributed
settings, where at each location one may only sample certain subsets of all the
information.

Our results tell us exactly which entries to look for. Furthermore, it is fairly
simple to construct sampling patterns that satisfy the conditions of Theorems
1 and 2 and Corollary 1 without sampling full columns or rows. For
instance, we can generate random samplings, use Algorithm 1 to verify whether
they satisfy condition (ii) (most of them will; see Figure 9), and keep them or
discard them accordingly.

Deterministic constructions are also possible. For instance, it is easy to verify
that each of the blocks in Example 4 satisfies (ii), which implies $\Omega$ satisfies the
conditions of Corollary 1, and can thus be uniquely completed. This example
corresponds to asking the $i$-th user of each block to rate items $i$ through $i + r$. We
conclude that if the entries one may choose to observe are limited (as is the case
in many LRMC applications), one can directly apply our results to adaptively
design observation patterns that guarantee completability.
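As an illustration of such a deterministic design, the sketch below builds the banded block described in this paragraph (the $i$-th column observes rows $i$ through $i + r$) and checks (ii) by exhaustive enumeration. It assumes that each block of Example 4 has this banded shape; the function names `banded_block` and `satisfies_ii` are hypothetical, and the exhaustive check is a slow substitute for Algorithm 1 that is only practical for small $d$.

```python
from itertools import combinations


def banded_block(d, r):
    """Column i (0-based) observes rows i, i + 1, ..., i + r; d - r columns in total."""
    return [set(range(i, i + r + 1)) for i in range(d - r)]


def satisfies_ii(columns, r):
    """Exhaustive check of (ii): any n columns must touch at least n + r rows."""
    return all(
        len(set().union(*subset)) >= n + r
        for n in range(1, len(columns) + 1)
        for subset in combinations(columns, n)
    )


d, r = 7, 2
block = banded_block(d, r)
# Corollary 1 asks for r + 1 disjoint blocks; here we simply repeat the same design.
pattern = [block for _ in range(r + 1)]
print(all(satisfies_ii(b, r) for b in pattern))
```

Each column in this design has exactly $r + 1$ observations, so no user is asked to rate more than $r + 1$ items.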

4 Proof of Theorem 1 

For any subspace, matrix or vector that is compatible with a set of indices $\omega$,
we will use the subscript $\omega$ to denote its restriction to the coordinates/rows in
$\omega$. For example, letting $\omega_i$ denote the indices of the nonzero rows of the $i$-th
column of $\Omega$, then $x_{\omega_i} \in \mathbb{R}^{r+1}$ and $S^\star_{\omega_i} \subset \mathbb{R}^{r+1}$ denote the restrictions of the
$i$-th column in $X$ and $S^\star$ to the indices in $\omega_i$. We say that an $r$-dimensional
subspace $S$ fits $X_\Omega$ if $x_{\omega_i} \in S_{\omega_i}$ $\forall i$.

The Variety S 

Let us start by studying the variety of all $r$-dimensional subspaces that fit $X_\Omega$.
First observe that in general, the restriction of an $r$-dimensional subspace to
$\ell \le r$ coordinates is $\mathbb{R}^\ell$. We formalize this in the following definition, which
essentially states that a subspace is non-degenerate if its restrictions to $\ell \le r$
coordinates are $\mathbb{R}^\ell$.

Definition 1 (Degenerate subspace). We say $S \in \mathrm{Gr}(r,\mathbb{R}^d)$ is degenerate if
and only if there exists a set $\omega \subset \{1,\dots,d\}$ with $|\omega| \le r$, such that $\dim S_\omega < |\omega|$.

Let $\nu_G$ denote the uniform measure on $\mathrm{Gr}(r,\mathbb{R}^d)$. A subspace is degenerate
if and only if an $r \times r$ submatrix of one of its bases is rank-deficient. This
equates to having a zero determinant. Since the determinant is a polynomial in
the entries of a matrix, this is a condition of $\nu_G$-measure zero.

Since $\nu_G$-almost every subspace is non-degenerate, let us consider only the
subspaces in $\mathrm{Gr}_*(r,\mathbb{R}^d) \subset \mathrm{Gr}(r,\mathbb{R}^d)$, the set of all non-degenerate $r$-dimensional
subspaces of $\mathbb{R}^d$.

Define $\mathcal{S}(X_\Omega) \subset \mathrm{Gr}_*(r,\mathbb{R}^d)$ such that every $S \in \mathcal{S}(X_\Omega)$ fits $X_\Omega$, i.e.,

$$\mathcal{S}(X_\Omega) := \left\{ S \in \mathrm{Gr}_*(r,\mathbb{R}^d) \,:\, \{x_{\omega_i} \in S_{\omega_i}\}_{i=1}^{N} \right\}.$$

Let $U \in \mathbb{R}^{d \times r}$ be a basis of $S \in \mathcal{S}(X_\Omega)$. The condition $x_{\omega_i} \in S_{\omega_i}$ is equivalent
to saying that there exists a vector $\theta_i \in \mathbb{R}^r$ such that

$$x_{\omega_i} = U_{\omega_i} \theta_i. \tag{5}$$

We can see that if $x_{\omega_i}$ has fewer than $r$ observations, (5) will be an underdetermined
system with infinitely many solutions, and hence $x_{\omega_i}$ can be completed
in infinitely many ways.

If $x_{\omega_i}$ has exactly $r$ observations, (5) becomes a system with $r$ equations and
$r$ unknowns (the elements of $\theta_i$). This will be the case for every $S \in \mathrm{Gr}_*(r,\mathbb{R}^d)$.
Hence a column with exactly $r$ observations can be uniquely completed once $S^\star$
is known, but it provides no information to identify $S^\star$.

On the other hand, if $x_{\omega_i}$ has exactly $r + 1$ observations, then (5) becomes
an overdetermined system with $r + 1$ equations and $r$ unknowns. This imposes
one constraint on the elements of $U_{\omega_i}$, thus restricting the set of subspaces that
fit $x_{\omega_i}$.

In general, each column with $r + 1$ observations will impose one constraint
that may reduce one of the $r(d-r)$ degrees of freedom in $\mathrm{Gr}_*(r,\mathbb{R}^d)$. Therefore,
one necessary condition for completion is that $X_\Omega$ imposes at least $r(d-r)$
constraints.

We will now study these constraints and characterize exactly when they will
reduce all the $r(d-r)$ degrees of freedom in $\mathrm{Gr}_*(r,\mathbb{R}^d)$, thus restricting $\mathcal{S}(X_\Omega)$
to a set with at most finitely many elements.

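Let $\{\Lambda_i, \nu_i\}$ be a partition of the $r + 1$ elements of $\omega_i$, such that $\Lambda_i$ has
exactly $r$ elements, and $\nu_i$ has only one element. We can then expand (5) as

$$\begin{bmatrix} x_{\Lambda_i} \\ x_{\nu_i} \end{bmatrix} = \begin{bmatrix} U_{\Lambda_i} \\ U_{\nu_i} \end{bmatrix} \theta_i.$$

Since $S$ is non-degenerate, $U_{\Lambda_i}$ is full-rank, so we may solve for $\theta_i$ using the
top block to obtain $\theta_i = U_{\Lambda_i}^{-1} x_{\Lambda_i}$. Plugging this into the last row, we have that
(5) is equivalent to:

$$x_{\nu_i} = U_{\nu_i} U_{\Lambda_i}^{-1} x_{\Lambda_i}. \tag{6}$$

On the other hand, $x_{\omega_i}$ lies in $S^\star_{\omega_i}$ by assumption. This implies that there
exists a unique $\theta_i^\star \in \mathbb{R}^r$ such that

$$x_{\omega_i} = U^\star_{\omega_i} \theta_i^\star, \tag{7}$$

where $U^\star$ is a basis of $S^\star$. Substituting (7) in (6) we obtain

$$U^\star_{\nu_i} \theta_i^\star = U_{\nu_i} U_{\Lambda_i}^{-1} U^\star_{\Lambda_i} \theta_i^\star. \tag{8}$$

Recall that $U_{\Lambda_i}^{-1} = \operatorname{adj}(U_{\Lambda_i}) / |U_{\Lambda_i}|$, where $\operatorname{adj}(U_{\Lambda_i})$ and $|U_{\Lambda_i}|$ denote the adjugate
and the determinant of $U_{\Lambda_i}$. Therefore, we may rewrite (8) as the following
polynomial equation:

$$\left( |U_{\Lambda_i}| \, U^\star_{\nu_i} - U_{\nu_i} \operatorname{adj}(U_{\Lambda_i}) \, U^\star_{\Lambda_i} \right) \theta_i^\star = 0. \tag{9}$$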
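A quick numerical sanity check of the polynomial constraint (9) may help fix ideas. The sketch below is illustrative only: the function name `f_i`, the dimensions, and the particular observed rows are assumptions made for the example. It evaluates the left-hand side of (9) and shows that it vanishes for any basis of the true subspace, while it is generically nonzero for an unrelated subspace.

```python
import numpy as np


def f_i(U, U_star, theta_star, lam, nu):
    """Left-hand side of (9) for a column observed on rows `lam` (r rows) and `nu` (one row)."""
    U_lam = U[lam]
    det = np.linalg.det(U_lam)
    adj = det * np.linalg.inv(U_lam)  # adjugate of an invertible matrix
    return (det * U_star[nu] - U[nu] @ adj @ U_star[lam]) @ theta_star


rng = np.random.default_rng(0)
d, r = 6, 2
U_star = rng.standard_normal((d, r))   # a basis of the true subspace
theta_star = rng.standard_normal(r)    # coefficients of one column
lam, nu = [0, 3], 5                    # hypothetical observed rows (r + 1 in total)

M = rng.standard_normal((r, r))        # change of basis within the same subspace
print(f_i(U_star, U_star, theta_star, lam, nu))        # numerically zero
print(f_i(U_star @ M, U_star, theta_star, lam, nu))    # numerically zero as well
print(f_i(rng.standard_normal((d, r)), U_star, theta_star, lam, nu))  # generically nonzero
```

The second evaluation illustrates that the constraint depends on the subspace and not on the particular basis chosen for it.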


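We conclude that a subspace $S$ with basis $U$ fits $X_\Omega$ if and only if $U$ satisfies
(9) for every $i = 1,\dots,N$.

Since every nontrivial subspace has infinitely many bases, even if there is
only one $r$-dimensional subspace in $\mathcal{S}(X_\Omega)$, the variety

$$\left\{ U \in \mathbb{R}^{d \times r} \,:\, \left( |U_{\Lambda_i}| \, U^\star_{\nu_i} - U_{\nu_i} \operatorname{adj}(U_{\Lambda_i}) \, U^\star_{\Lambda_i} \right) \theta_i^\star = 0 \ \ \forall i \right\}$$

has infinitely many solutions. Therefore, we will associate a unique $U$ with
each subspace as follows. Observe that for every $S \in \mathrm{Gr}_*(r,\mathbb{R}^d)$, we can write
$S = \operatorname{span}\{U\}$ for a unique $U$ in the following column echelon form:

$$U = \begin{bmatrix} I \\ V \end{bmatrix}. \tag{10}$$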

On the other hand, every $V \in \mathbb{R}^{(d-r) \times r}$ defines a unique $r$-dimensional subspace
of $\mathbb{R}^d$, via $\operatorname{span}\{U\}$. Moreover, $\operatorname{span}\{U\}$ will be non-degenerate for almost
every $V$, with respect to $\nu_V$: the Lebesgue measure on $\mathbb{R}^{(d-r) \times r}$. Let
$\mathbb{R}_*^{(d-r) \times r} \subset \mathbb{R}^{(d-r) \times r}$ denote the set of all $(d-r) \times r$ matrices $V$ whose $\operatorname{span}\{U\}$
is non-degenerate, or equivalently, whose $r \times r$ submatrices of $U$ are full-rank.
Then we have a bijection between $\mathrm{Gr}_*(r,\mathbb{R}^d)$ and $\mathbb{R}_*^{(d-r) \times r}$ via $S = \operatorname{span}\{U\}$.
It follows that a statement holds for $(\nu_G \times \nu_\Theta)$-almost every pair $\{S^\star, \Theta^\star\}$ if
and only if it holds for $(\nu_V \times \nu_\Theta)$-almost every pair $\{V^\star, \Theta^\star\}$. We will use these
measures interchangeably.
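The normalization to column echelon form can be illustrated numerically. The following sketch is a minimal example, assuming the identity block sits in the first $r$ rows as in (10); the helper names `column_echelon` and `is_nondegenerate` are hypothetical. It shows that two different bases of the same subspace are mapped to the same representative, and it tests non-degeneracy by checking every $r \times r$ submatrix.

```python
import numpy as np
from itertools import combinations


def column_echelon(U):
    """Return the basis [I; V] spanning the same subspace as U (requires U[:r] invertible)."""
    r = U.shape[1]
    return U @ np.linalg.inv(U[:r])


def is_nondegenerate(U, tol=1e-9):
    """True if every r x r submatrix of the basis U is full-rank (up to `tol`)."""
    d, r = U.shape
    return all(
        abs(np.linalg.det(U[list(rows)])) > tol
        for rows in combinations(range(d), r)
    )


rng = np.random.default_rng(0)
d, r = 5, 2
U = rng.standard_normal((d, r))
M = rng.standard_normal((r, r))   # a different basis of the same subspace is U @ M

E1, E2 = column_echelon(U), column_echelon(U @ M)
print(np.allclose(E1, E2))        # both bases give the same representative
print(E1[:r])                     # identity block
print(E1[r:])                     # the (d - r) x r matrix V
print(is_nondegenerate(U))
```

The bottom block `E1[r:]` plays the role of $V$, so a subspace is described by $r(d-r)$ free parameters, matching the degrees of freedom counted earlier.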


The Set $\mathcal{F}$

Continuing with our analysis, recall that a subspace $S$ with basis $U$ will fit $X_\Omega$
if and only if $U$ satisfies (9) for every $i$. With this in mind, define

$$f_i(V \mid V^\star, \theta_i^\star) := \left( |U_{\Lambda_i}| \, U^\star_{\nu_i} - U_{\nu_i} \operatorname{adj}(U_{\Lambda_i}) \, U^\star_{\Lambda_i} \right) \theta_i^\star,$$

with $U$ and $U^\star$ in the column echelon form in (10). We will use $f_i$ as shorthand,
with the understanding that $f_i$ is a polynomial in the elements of $V$, and that
the elements of $V^\star$ and $\theta_i^\star$ play the role of coefficients.

Furthermore, let

$$\mathcal{F}(V \mid V^\star, \Theta^\star) := \{f_i\}_{i=1}^{N},$$

and use $\mathcal{F}(V)$, or simply $\mathcal{F}$ as shorthand, with the understanding that $\mathcal{F}$ is a
set of polynomials in the elements of $V$, and that the elements of $V^\star$ and $\Theta^\star$
play the role of coefficients. We will also use $\mathcal{F} = 0$ as shorthand for $\{f_i = 0\}_{i=1}^{N}$.
This way, we may rewrite:

$$\mathcal{S}(X_\Omega) = \left\{ \operatorname{span}\begin{bmatrix} I \\ V \end{bmatrix} \in \mathrm{Gr}_*(r,\mathbb{R}^d) \,:\, \mathcal{F}(V) = 0 \right\}.$$

In general, the affine variety

$$\mathcal{V}(\mathcal{F}) := \left\{ V \in \mathbb{R}^{(d-r) \times r} \,:\, \mathcal{F}(V) = 0 \right\}$$

could contain an infinite number of elements. We are interested in conditions
that guarantee there is only one or (slightly less demanding) only a finite number.
The following lemma states that this will be the case if and only if $r(d-r)$
polynomials in $\mathcal{F}$ are algebraically independent.

Lemma 2. Let A1 hold. For a.e. $X$, $\mathcal{S}(X_\Omega)$ contains at most finitely many
subspaces if and only if $r(d-r)$ polynomials in $\mathcal{F}$ are algebraically independent.

Proof. By our previous discussion, for a.e. $X$ there are at most finitely many
subspaces in $\mathcal{S}(X_\Omega)$ if and only if there are at most finitely many points in
$\mathcal{V}(\mathcal{F})$. We know from algebraic geometry that this will be the case if and only
if $\dim \mathcal{V}(\mathcal{F}) = 0$ (see, e.g., Proposition 6 in Chapter 9, Section 4 of [19]).

Since $\mathcal{V}(\mathcal{F}) \subset \mathbb{R}^{(d-r) \times r}$, we know that if $\dim \mathcal{V}(\mathcal{F}) = 0$, then $\mathcal{F}$ must contain
$r(d-r)$ algebraically independent polynomials (see, e.g., Exercise 16 in Chapter
9, Section 6 of [19]).

On the other hand, we know that $\dim \mathcal{V}(\mathcal{F}) = 0$ if $r(d-r)$ polynomials in $\mathcal{F}$
are a regular sequence (see, e.g., Exercise 8 in Chapter 9, Section 4 of [19]).

Finally, since being a regular sequence is an open condition, it follows that for
$(\nu_V \times \nu_\Theta)$-almost every $\{V^\star, \Theta^\star\}$, polynomials in $\mathcal{F}$ are algebraically independent
if and only if they are a regular sequence (see, e.g., Remark 3.4 in [20]). □

Remark 2. The next part of our analysis studies conditions to guarantee that
the polynomials in $\mathcal{F}$ are algebraically independent. Following up on Remark
1, any observation, in addition to the $r + 1$ per column that we assume, cannot
increase the number of subspaces that agree with the observations. In effect, each
observed entry, in addition to the first $r + 1$ observations, places one additional
polynomial constraint analogous to $f_i$. However, the polynomials produced by
the same column share the same coefficient $\theta_i^\star$. Intuitively, this means that the
polynomials are no longer generic. While these polynomials might or might not
be algebraically dependent, in general it is difficult to determine which is the
case.

For this reason we assume A1: that each column is observed on exactly $r + 1$
entries. This way the $r + 1$ entries in each column produce only one polynomial
constraint. This guarantees that we only use one polynomial per column, so that
all the coefficients of the polynomials in $\mathcal{F}$ are generic, and easier to study. In
general, if some columns are observed on more than $r + 1$ entries, all we need
is that the observed entries contain a pattern with exactly $r + 1$ observations per
column satisfying our sampling conditions.

Algebraic Independence 

By the previous discussion, there are at most finitely many $r$-dimensional subspaces
that fit $X_\Omega$ if and only if there is a subset $\tilde{\mathcal{F}}$ of $r(d-r)$ polynomials in
$\mathcal{F}$ that is algebraically independent.

Whether this is the case depends on the supports of the polynomials in
$\tilde{\mathcal{F}}$, i.e., on $\tilde{\Omega}$: the subset of columns in $\Omega$ corresponding to such polynomials.
Lemma 3 shows that the polynomials in $\tilde{\mathcal{F}}$ will be algebraically independent if
and only if $\tilde{\Omega}$ satisfies the conditions in Theorem 1.

Lemma 3. Let A1 hold. For a.e. $X$, the polynomials in $\tilde{\mathcal{F}}$ are algebraically
dependent if and only if $n(\Omega') > r(m(\Omega') - r)$ for some matrix $\Omega'$ formed with
a subset of the columns in $\tilde{\Omega}$.

To show this statement we will use Lemmas 4 and 5 below. 

Let $\Omega'$ be a subset of the columns in $\tilde{\Omega}$, and let $\mathcal{F}'$ be the subset of the $n(\Omega')$
polynomials in $\tilde{\mathcal{F}}$ corresponding to such columns. Notice that $\mathcal{F}'$ only involves
the variables in $U$ corresponding to the $m(\Omega')$ nonzero rows of $\Omega'$.

Let $\aleph(\Omega')$ be the largest number of algebraically independent polynomials
in $\mathcal{F}'$.

Lemma 4. For a.e. $X$, $\aleph(\Omega') \le r(m(\Omega') - r)$.

Proof. Observe that the column echelon form in (10) was chosen arbitrarily. As
a matter of fact, for every permutation of rows $\Pi$ and every $S \in \mathrm{Gr}_*(r,\mathbb{R}^d)$,
we may write $S = \operatorname{span}\{U\}$, for a unique $U$ in the following permuted column
echelon form:

$$U = \Pi \begin{bmatrix} I \\ V \end{bmatrix}.$$

For example, we could take $\Pi$ to swap the top and bottom blocks in (10), and
take $U$ in the following form:

$$U = \begin{bmatrix} V \\ I \end{bmatrix} = \Pi \begin{bmatrix} I \\ V \end{bmatrix}.$$

Observe that in general, $U$, $V$ and $\mathcal{F}$ will be different for each choice of $\Pi$.
Nevertheless, the condition $x_{\omega_i} \in S_{\omega_i}$ is invariant to the choice of basis of $S$.
This implies that while different choices of $\Pi$ produce different $\mathcal{F}$'s, the variety

$$\mathcal{S}(X_\Omega) = \left\{ \operatorname{span}\, \Pi \begin{bmatrix} I \\ V \end{bmatrix} \in \mathrm{Gr}_*(r,\mathbb{R}^d) \,:\, \mathcal{F}(V) = 0 \right\}$$

is the same for every $\Pi$.

This implies that the number of algebraically independent polynomials in $\mathcal{F}'$
is invariant to the choice of $\Pi$. Therefore, showing that Lemma 4 holds for one
particular $\Pi$ suffices to show that it holds for every $\Pi$.

With this in mind, take $\Pi$ such that $U$ is written with the identity block in
the position of $r$ nonzero rows of $\Omega'$.

Since the polynomials in $\mathcal{F}'$ only involve the elements of the $m(\Omega')$ rows of
$U$ corresponding to the nonzero rows of $\Omega'$, and $U$ has the identity block in
the position of $r$ nonzero rows of $\Omega'$, it follows that the polynomials in $\mathcal{F}'$ only
involve the $r(m(\Omega') - r)$ variables in the $m(\Omega') - r$ corresponding rows of $V$.
Furthermore, $\mathcal{F}' = 0$ has at least one solution. This implies $\aleph(\Omega') \le r(m(\Omega') - r)$,
as desired. □

We say $\mathcal{F}'$ is minimally algebraically dependent if the polynomials in $\mathcal{F}'$ are
algebraically dependent, but every proper subset of the polynomials in $\mathcal{F}'$ is
algebraically independent.

Lemma 5. For a.e. $X$, if $\mathcal{F}'$ is minimally algebraically dependent, then $n(\Omega') =
r(m(\Omega') - r) + 1$.

In order to prove Lemma 5 we will need the next two lemmas.

Lemma 6. Take $\Pi$ such that $U_{\Lambda_i} = U^\star_{\Lambda_i} = I$. For a.e. $X$, if $\mathcal{F}' = \{\mathcal{F}'', f_i\}$ is
minimally algebraically dependent, then all solutions to $\mathcal{F}' = 0$ satisfy $U_{\nu_i} = U^\star_{\nu_i}$.

The intuition behind this lemma is as follows: suppose for contrapositive that
there are infinitely many solutions to $\mathcal{F}'' = 0$ with $U_{\Lambda_i} = I$. Each of these
solutions defines a different subspace. Since $\{\mathcal{F}'', f_i\}$ is minimally algebraically
dependent, a.e. solution to $\mathcal{F}'' = 0$ must fit $x_{\omega_i}$. This will only happen if $x_{\omega_i}$ lies
in the intersection of infinitely many $r$-dimensional subspaces, which is at most
$(r-1)$-dimensional. But since $x_{\omega_i}$ is drawn from $S^\star$ (an $r$-dimensional subspace),
we know that almost surely $x_{\omega_i}$ will not lie in such an $(r-1)$-dimensional subspace.

Proof. Suppose that $\mathcal{F}' = \{\mathcal{F}'', f_i\}$ is minimally algebraically dependent, and let
$v_i$ denote the row of $V$ corresponding to $U_{\nu_i}$, such that $f_i$ simplifies into

$$f_i(v_i, U_{\Lambda_i} \mid V^\star, \theta_i^\star) = \left( |U_{\Lambda_i}| \, v_i^\star - v_i \operatorname{adj}(U_{\Lambda_i}) \, U^\star_{\Lambda_i} \right) \theta_i^\star = (v_i^\star - v_i)\, \theta_i^\star.$$

Since $f_i$ involves $v_i$, $\mathcal{F}''$ must contain at least one polynomial in $v_i$ (otherwise
$\mathcal{F}'$ cannot be minimally algebraically dependent). This means that $\mathcal{F}''$ contains
at least one polynomial $f_j$ involving $v_i$:

$$f_j(v_i, U_{\Lambda_j} \mid V^\star, \theta_j^\star) = \left( |U_{\Lambda_j}| \, v_i^\star - v_i \operatorname{adj}(U_{\Lambda_j}) \, U^\star_{\Lambda_j} \right) \theta_j^\star.$$

For a.e. $X$, $\theta_j^\star$ is independent of $\theta_i^\star$, so $(\nu_V \times \nu_\Theta)$-almost surely, $f_i \neq f_j$.

We want to show that if $\mathcal{F}'$ is minimally algebraically dependent, then $v_i =
v_i^\star$ is the only solution to $\mathcal{F}' = 0$. So define $v_i =: [v_{i1} \ v_{i2}]$, and assume for
contradiction that there exists a solution to $\mathcal{F}'' = 0$ with $v_{i2} = \gamma \neq v^\star_{i2}$ and
$U_{\Lambda_j} = \Gamma$, that is also a solution to $\mathcal{F}' = 0$.

Next consider the univariate polynomials in $v_{i1}$ evaluated at this solution:

$$g_i(v_{i1} \mid V^\star, \theta_i^\star) := f_i(v_{i1}, v_{i2}, U_{\Lambda_i} \mid V^\star, \theta_i^\star)\Big|_{v_{i2} = \gamma,\ U_{\Lambda_i} = I}$$
$$g_j(v_{i1} \mid V^\star, \theta_j^\star) := f_j(v_{i1}, v_{i2}, U_{\Lambda_j} \mid V^\star, \theta_j^\star)\Big|_{v_{i2} = \gamma,\ U_{\Lambda_j} = \Gamma}$$

and observe that since $\{\gamma, \Gamma\}$ are a solution to $\mathcal{F}'$, then $g_i$ and $g_j$ must have a
common root.

We know from elimination theory that two distinct polynomials $g_i, g_j$ have a
common root if and only if their resultant $\operatorname{Res}(g_i, g_j)$ is zero (see, for example,
Proposition 8 in Chapter 3, Section 5 of [19]).

But $\operatorname{Res}(g_i, g_j)$ is a polynomial in the coefficients of $g_i$ and $g_j$. In other
words, $\operatorname{Res}(g_i, g_j) = h(V^\star, \theta_i^\star, \theta_j^\star)$ for some nonzero polynomial $h$ in $V^\star$, $\theta_i^\star$
and $\theta_j^\star$. Therefore, $h \neq 0$ for $(\nu_V \times \nu_\Theta)$-almost every $\{V^\star, \Theta^\star\}$ (since the variety
defined by $h = 0$ has measure zero). Equivalently, $h \neq 0$ for a.e. $X$. Since
$\operatorname{Res}(g_i, g_j) \neq 0$, it follows that $g_i$ and $g_j$ do not have a common root $v_{i1}$, which
is the desired contradiction.

This will be true for either almost every $\gamma$ in an infinite collection, or for
every $\gamma$ in a finite collection. In the first case, we would conclude that $\mathcal{F}' = 0$
has infinitely fewer solutions than $\mathcal{F}'' = 0$, in contradiction to the minimally
algebraically dependent assumption. In the second case, we conclude that $v^\star_{i2}$
is the only solution to $\mathcal{F}' = 0$.

Since $v_{i1}$ was an arbitrary entry of $U_{\nu_i}$, we conclude that for a.e. $X$, if $\mathcal{F}'$
is minimally algebraically dependent, then $U_{\nu_i} = U^\star_{\nu_i}$ is the only solution to
$\mathcal{F}' = 0$, as desired. □


Define $\{\mathbb{V}_t, \mathbb{V}_t^c\}$ as the partition of the variables involved in the polynomials
in $\mathcal{F}'_t \subset \mathcal{F}'$, such that all the variables in $\mathbb{V}_t$ are uniquely determined by $\mathcal{F}' = 0$.

Lemma 7. Suppose $\mathbb{V}_t \neq \emptyset$ and that every $f_i \in \mathcal{F}'_t$ is a polynomial in at least
one of the variables in $\mathbb{V}_t$. Then for a.e. $X$, all the variables involved in $\mathcal{F}'_t$ are
uniquely determined by $\mathcal{F}' = 0$.

Proof. Let $v_c$ be one of the variables in $\mathbb{V}_t^c$ and let $f_i$ be a polynomial in $\mathcal{F}'_t$
involving $v_c$. By assumption on $\mathcal{F}'_t$, $f_i$ also involves at least one of the variables
in $\mathbb{V}_t$, say $v$.

Let $w$ denote the set of all variables involved in $f_i$ except $v$. Observe that
$v_c \in w$. This way, $f_i$ is shorthand for $f_i(v, w \mid V^\star, \theta_i^\star)$.

We will show that for a.e. $X$, all the variables in $w$ are also uniquely determined
by $\mathcal{F}' = 0$.

Suppose there exists a solution to $\mathcal{F}' = 0$ with $w = \gamma$, and define the univariate
polynomial

$$g(v \mid V^\star, \theta_i^\star) := f_i(v, w \mid V^\star, \theta_i^\star)\Big|_{w = \gamma}.$$

Now assume for contradiction that there exists another solution to $\mathcal{F}' = 0$ with
$w \neq \gamma$. Let $w = \gamma'$ be another solution to $\mathcal{F}' = 0$, and define

$$g'(v \mid V^\star, \theta_i^\star) := f_i(v, w \mid V^\star, \theta_i^\star)\Big|_{w = \gamma'}.$$

We will first show that $g \neq g'$. To see this, recall the definition of $f_i$, and
observe that it depends on the choice of $\nu_i$. Nevertheless, it is easy to see that
$f_i = 0$ describes the same variety regardless of the choice of $\nu_i$. Intuitively, this
means that even though $f_i$ might look different for each choice of $\nu_i$, it really
is the same.

Therefore, we may select $\nu_i$ to be the element of $\omega_i$ corresponding to the
position of a variable of $w$ that takes different values in $\gamma$ and $\gamma'$. This way,
a variable with multiple solutions is located in the location of $U_{\nu_i}$. Since $f_i$ is
linear in $U_{\nu_i}$, it follows that $g \neq g'$ for $(\nu_V \times \nu_\Theta)$-almost every $\{V^\star, \Theta^\star\}$.

Now observe that since $v$ is uniquely determined by $\mathcal{F}' = 0$, $g$ and $g'$ have a
common root, which immediately implies that there are at most finitely many
distinct $g'$. Otherwise, $v$ would be a common root to infinitely many distinct
polynomials, which $(\nu_V \times \nu_\Theta)$-almost surely cannot be the case.

We know from elimination theory that two distinct polynomials $g, g'$ have a
common root if and only if their resultant $\operatorname{Res}(g, g')$ is zero (see, for example,
Proposition 8 in Chapter 3, Section 5 of [19]).

But $\operatorname{Res}(g, g')$ is a polynomial in the coefficients of $g$ and $g'$. In other words,
$\operatorname{Res}(g, g') = h(V^\star, \theta_i^\star)$ for some nonzero polynomial $h$ in $V^\star$ and $\theta_i^\star$. Therefore,
$h \neq 0$ for $(\nu_V \times \nu_\Theta)$-almost every $\{V^\star, \Theta^\star\}$ (since the variety defined by $h = 0$
has measure zero). Equivalently, $h \neq 0$ for a.e. $X$.

Since $\operatorname{Res}(g, g') \neq 0$, it follows that $g$ and $g'$ do not have a common root $v$,
which is the desired contradiction. This is true for all of the finitely many $g'$.
This shows that for a.e. $X$, all the variables in $w$ (including $v_c$) are uniquely
determined by $\mathcal{F}' = 0$.

Since $v_c$ was an arbitrary element in $\mathbb{V}_t^c$, we conclude that all the variables
in $\mathbb{V}_t^c$ are uniquely determined by $\mathcal{F}' = 0$. □

With this, we are now ready to present the proofs of Lemma 5, Lemma 3 
and Theorem 1. 

Proof. (Lemma 5) By the same arguments as in Lemma 4, whether $\mathcal{F}'$ is minimally
algebraically dependent is invariant to any permutation $\Pi$ of the rows of
the column echelon form in (10). Therefore, showing that Lemma 5 holds for
one particular choice of $\Pi$ suffices to show it holds for every $\Pi$.

With this in mind, suppose $\mathcal{F}' = \{\mathcal{F}'', f_i\}$ is minimally algebraically dependent.
Take $\Pi$ such that $U$ and $U^\star$ are written in the column echelon form in
(10) with the identity block in the rows indexed by $\Lambda_i$, and let $v_i$ denote the
row of $V$ corresponding to $U_{\nu_i}$, such that

$$U_{\omega_i} = \begin{bmatrix} U_{\Lambda_i} \\ U_{\nu_i} \end{bmatrix} = \begin{bmatrix} I \\ v_i \end{bmatrix}.$$

We know by Lemma 6 that $v_i$ is uniquely determined by $\mathcal{F}' = 0$. We will
now iteratively use Lemma 7 to show that all the variables in $\mathcal{F}'$ (which are the
same as the variables in $\mathcal{F}''$) are also uniquely determined by $\mathcal{F}' = 0$. This will
imply that all the variables in $\mathcal{F}''$ are finitely determined by $\mathcal{F}'' = 0$, and that $\mathcal{F}''$
contains the same number of polynomials as variables, $r(m(\Omega'') - r)$,
which is the desired conclusion.

First observe that since $v_i$ is finitely determined by $\mathcal{F}'' = 0$, $\mathcal{F}''$ must contain
at least $r$ polynomials in $v_i$. Denote these polynomials by $\mathcal{F}'_1 \subset \mathcal{F}''$.

We will proceed inductively, indexed by $t \ge 1$. First, set $t = 1$ and define
$\mathbb{V}_1 = \{v_i\}$. We showed above that the variables in $\mathbb{V}_1$ are uniquely determined
by $\mathcal{F}' = 0$. Suppose that $\mathcal{F}'_1$ involves some variables other than those in $\mathbb{V}_1$.
Note that every polynomial in $\mathcal{F}'_1$ involves at least one of the variables in $\mathbb{V}_1$.
Let $\mathbb{V}_2$ be the set of all variables involved in $\mathcal{F}'_1$. By Lemma 7, all the variables
in $\mathbb{V}_2$ are uniquely determined by $\mathcal{F}' = 0$.

We will now proceed inductively. For any $t \ge 2$, let $\mathbb{V}_t$ be a subset of $n_t$
variables in $V$. Assume that all the variables in $\mathbb{V}_t$ are uniquely determined
by $\mathcal{F}' = 0$. Since $\dim \mathcal{V}(\mathcal{F}'') = \dim \mathcal{V}(\mathcal{F}')$, it follows that all the variables in
$\mathbb{V}_t$ are finitely determined by $\mathcal{F}'' = 0$. It follows that $\mathcal{F}''$ must contain at least
$n_t$ algebraically independent polynomials, each involving at least one of the
variables in $\mathbb{V}_t$. Let $\mathcal{F}'_t$ be this set of polynomials. Suppose $\mathcal{F}'_t$ involves some
variables other than $\mathbb{V}_t$. Define $\mathbb{V}_{t+1}$ to be the set of all variables involved in
$\mathcal{F}'_t$. By Lemma 7, all the variables in $\mathbb{V}_{t+1}$ are uniquely determined by $\mathcal{F}' = 0$.

Since this is true for every $t$, and there are finitely many variables, this
process must terminate at some finite step $T$, at which point $\mathcal{F}'_T$ is a set of $n_T$
algebraically independent polynomials in $n_T$ variables.

This means that all the variables in $\mathcal{F}'_T$ are finitely determined by $\mathcal{F}'_T = 0$,
and since $f_i$ only involves a subset of the variables in $\mathcal{F}'_T$, it follows that the
polynomials in $\{\mathcal{F}'_T, f_i\} \subset \mathcal{F}'$ are algebraically dependent. Furthermore, since $\mathcal{F}'$
is minimally algebraically dependent by assumption, we have that $\mathcal{F}'_T = \mathcal{F}''$.

Finally, observe that $\mathcal{F}''$ contains $n(\Omega'')$ polynomials in $r(m(\Omega'') - r)$ variables.
Since $\mathcal{F}'' = \mathcal{F}'_T$, and $\mathcal{F}'_T$ has $n_T$ polynomials in $n_T$ variables, it follows
that $n(\Omega'') = r(m(\Omega'') - r)$, as desired. □
Proof. (Lemma 3)

($\Rightarrow$) Suppose $\mathcal{F}'$ is minimally algebraically dependent. By Lemma 5, $n(\Omega') =
r(m(\Omega') - r) + 1 > r(m(\Omega') - r)$, and we have the first implication.

($\Leftarrow$) Suppose there exists an $\Omega'$ with $n(\Omega') > r(m(\Omega') - r)$. By Lemma 4,
$n(\Omega') > \aleph(\Omega')$, which implies the polynomials in $\mathcal{F}'$, and hence $\tilde{\mathcal{F}}$, are
algebraically dependent. □


Proof. (Theorem 1)

($\Rightarrow$) Suppose for contrapositive that for every $\tilde{\Omega}$ formed with $r(d-r)$ columns
of $\Omega$, there exists an $\Omega'$ formed with a subset of its columns such that
$m(\Omega') < n(\Omega')/r + r$. Lemma 3 implies that the polynomials in $\mathcal{F}'$, and
hence $\tilde{\mathcal{F}}$, are algebraically dependent. It follows by Lemma 2 that there
are infinitely many subspaces in $\mathcal{S}(X_\Omega)$.

($\Leftarrow$) Suppose that for some $\tilde{\Omega}$ formed with $r(d-r)$ columns of $\Omega$, every $\Omega'$
formed with a subset of the columns in $\tilde{\Omega}$ satisfies $m(\Omega') \ge n(\Omega')/r + r$,
including $\tilde{\Omega}$. By Lemma 3, the $r(d-r)$ polynomials in $\tilde{\mathcal{F}}$ are algebraically
independent. It follows by Lemma 2 that there are at most finitely many
subspaces in $\mathcal{S}(X_\Omega)$, hence at most finitely many rank-$r$ completions of
$X_\Omega$. □

5 Proof of Theorem 2 

In this section we give the proof of Theorem 2. We will use $X_{\tilde{\Omega}}$ and $X_{\hat{\Omega}}$ to
denote the $d \times r(d-r)$ and $d \times (d-r)$ submatrices of $X_\Omega$ corresponding to $\tilde{\Omega}$
and $\hat{\Omega}$. In addition, let $\hat{\omega}_i$ and $x_{\hat{\omega}_i}$ denote the $i$-th columns of $\hat{\Omega}$ and $X_{\hat{\Omega}}$.

In order to prove Theorem 2, we will require Theorem 1 in [21], which we
state here as the following lemma, with some minor adaptations to our context.

Lemma 8. Suppose $\hat{\Omega}$ is a $d \times (d-r)$ matrix with binary entries for which (ii)
holds and let $S \in \mathrm{Gr}(r,\mathbb{R}^d)$. Then for $\nu_G$-almost every $S^\star$, $\{S_{\hat{\omega}_i} = S^\star_{\hat{\omega}_i}\}_{i=1}^{d-r}$ if
and only if $S = S^\star$.

With this, we are ready to give the proof of Theorem 2.

Proof. (Theorem 2) Suppose $\Omega$ contains two disjoint matrices $\tilde{\Omega}$ and $\hat{\Omega}$ satisfying
the conditions of Theorem 2.

Since $\tilde{\Omega}$ satisfies (i), by Theorem 1 there are at most finitely many $r$-dimensional
subspaces that fit $X_{\tilde{\Omega}}$. Equivalently, the set $\tilde{\mathcal{F}}$, containing the
$r(d-r)$ polynomials defined by the columns in $X_{\tilde{\Omega}}$, is algebraically independent.
Let $\hat{f}_i$ be the polynomial defined by $x_{\hat{\omega}_i}$. It follows that the set $\{\tilde{\mathcal{F}}, \hat{f}_i\}$ is
algebraically dependent. Let $\mathcal{F}''$ be a subset of the polynomials in $\tilde{\mathcal{F}}$, such that
$\mathcal{F}' = \{\mathcal{F}'', \hat{f}_i\}$ is minimally algebraically dependent. Then any subspace $S$ with
basis $U$ that fits $X_{\tilde{\Omega}}$ and $x_{\hat{\omega}_i}$ must satisfy $\mathcal{F}' = 0$, implying by Lemma 6 that $U_{\hat{\omega}_i} = U^\star_{\hat{\omega}_i}$.

Therefore, every $S$ that fits both $X_{\tilde{\Omega}}$ and $X_{\hat{\Omega}}$ must satisfy $\{S_{\hat{\omega}_i} = S^\star_{\hat{\omega}_i}\}_{i=1}^{d-r}$.
Since $\hat{\Omega}$ satisfies (ii), it follows by Lemma 8 that $S = S^\star$. □

In Section 2 we mentioned that there are cases where $r(d-r)$ columns with
only $r + 1$ samples are sufficient for unique completability. The next result states
that this is indeed the case if $r = 1$.

Proposition 1. If $r = 1$, finite completability is equivalent to unique completability.

Proof. Assume $r = 1$. Then $U_{\Lambda_i}$ and $U_{\nu_i}$ are scalars, so $f_i$ simplifies into:

$$f_i = \left( u_{\Lambda_i} u^\star_{\nu_i} - u_{\nu_i} u^\star_{\Lambda_i} \right) \theta_i^\star.$$

This implies that $\mathcal{F} = 0$ is a system of linear equations, hence if it has finitely
many solutions, it has only one. □
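To make the rank-one case concrete, the sketch below sets up the linear system from the proof and solves it. It is an illustration under assumptions chosen for the example: the sampling is a chain in which column $i$ is observed on rows $i$ and $i+1$, the data are synthetic, and the function name `rank_one_direction` is hypothetical. Each column contributes the linear equation $u_{\Lambda_i} x_{\nu_i} - u_{\nu_i} x_{\Lambda_i} = 0$, which is $f_i = 0$ written in terms of the observed entries.

```python
import numpy as np


def rank_one_direction(d, columns):
    """Recover the direction of a 1-dimensional subspace from two samples per column.

    `columns` holds one tuple (lam, nu, x_lam, x_nu) per column: the two observed
    row indices and the two observed values.
    """
    A = np.zeros((len(columns), d))
    for i, (lam, nu, x_lam, x_nu) in enumerate(columns):
        # u_lam * x_nu - u_nu * x_lam = 0 is linear in the unknown basis vector u.
        A[i, lam] = x_nu
        A[i, nu] = -x_lam
    # The right singular vector of the smallest singular value spans the null space.
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1]


rng = np.random.default_rng(0)
d = 6
u_star = rng.uniform(1.0, 2.0, size=d)          # spans the true subspace
theta_star = rng.uniform(1.0, 2.0, size=d - 1)  # one coefficient per column

# r(d - r) = d - 1 columns, each observed on r + 1 = 2 rows.
columns = [
    (i, i + 1, u_star[i] * theta_star[i], u_star[i + 1] * theta_star[i])
    for i in range(d - 1)
]
u_hat = rank_one_direction(d, columns)
print(np.allclose(u_hat / u_hat[0], u_star / u_star[0]))
```

Normalizing by the first entry corresponds to the column echelon form in (10) with $r = 1$.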

In Section 2 we also mentioned that in general, strictly more than r(d - r ) 
columns with only r + 1 samples are necessary for unique completability. We 
would like to close this section with an example where this is the case. 

Example 5. Consider $d = 4$ and $r = 2$, such that $N = r(d-r) = 4$. Let

$$\Omega = \begin{bmatrix} 1 & 1 & 1 & 0 \\ 1 & 1 & 0 & 1 \\ 1 & 0 & 1 & 1 \\ 0 & 1 & 1 & 1 \end{bmatrix}.$$

It is easy to see that $\tilde{\Omega} = \Omega$ satisfies the conditions of Theorem 1. One may
also verify (for example, solving explicitly $\mathcal{F}(V) = 0$) that for a.e. $X$ there exist
two subspaces that fit $X_\Omega$.

As a matter of fact, this will also be the case for any permutation of the rows 
and columns of this matrix. One may construct similar samplings with the same 
property for larger d and r. All this to say that this is not a singular pathological 
example; there are many samplings that cannot be uniquely recovered with only 
r(d - r) columns with r + 1 samples. 
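The claim that the sampling of Example 5 satisfies the condition of Theorem 1 can be checked mechanically. The sketch below enumerates every subset of columns and tests $m(\Omega') \ge n(\Omega')/r + r$, written as $r\,m(\Omega') \ge n(\Omega') + r^2$ to avoid fractions. The function name `satisfies_i` is hypothetical, and the enumeration is exponential in the number of columns, so it is meant only for small examples such as this one.

```python
from itertools import combinations

# Sampling pattern of Example 5 (rows of the matrix; 1 marks an observed entry).
OMEGA = [
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 1, 1],
    [0, 1, 1, 1],
]


def satisfies_i(omega, r):
    """Check m(Omega') >= n(Omega')/r + r for every nonempty subset of columns."""
    d, n_cols = len(omega), len(omega[0])
    supports = [{i for i in range(d) if omega[i][j]} for j in range(n_cols)]
    for n in range(1, n_cols + 1):
        for subset in combinations(supports, n):
            m = len(set().union(*subset))  # number of nonzero rows of the subset
            if r * m < n + r * r:
                return False
    return True


print(satisfies_i(OMEGA, r=2))
```

Note that this only confirms finite completability of the pattern; it says nothing about how many subspaces fit, which is the point of the example.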

6 Additional Proofs 

In this section we present the proofs of Lemma 1 and Theorem 3. The proof
of Corollary 1 follows directly from Lemma 1 and Theorem 2, and the proof of
Corollary 2 follows directly from Theorem 1 in [21].

Proof. (Lemma 1) Suppose $\Omega$ contains disjoint matrices $\{\hat{\Omega}_\tau\}_{\tau=1}^{r}$ satisfying the
conditions of Lemma 1. Let $\Omega'$ be a matrix formed with a subset of the columns
in $[\hat{\Omega}_1 \cdots \hat{\Omega}_r]$. Then $\Omega' = [\Omega'_1 \cdots \Omega'_r]$ for some matrices $\{\Omega'_\tau\}_{\tau=1}^{r}$ formed with subsets of
the columns in $\{\hat{\Omega}_\tau\}_{\tau=1}^{r}$.

It follows that

$$n(\Omega') = \sum_{\tau=1}^{r} n(\Omega'_\tau) \le \sum_{\tau=1}^{r} \max_{\tau} n(\Omega'_\tau).$$

Assume without loss of generality that this maximum is achieved when $\tau = 1$.
Then

$$n(\Omega') \le r\, n(\Omega'_1) \le r(m(\Omega'_1) - r) \le r(m(\Omega') - r),$$

where the last two inequalities follow because (2) holds for every $\Omega'_\tau$ by assumption,
and because $m(\Omega') \ge m(\Omega'_\tau)$ for every $\tau$.

Since $\Omega'$ was arbitrary, we conclude that (1) holds for every matrix $\Omega'$ formed
with a subset of the columns in $[\hat{\Omega}_1 \cdots \hat{\Omega}_r]$. □

The following lemma shows that (ii) is satisfied with high probability under
uniform random sampling schemes with only $O(\max\{r, \log d\})$ samples per
column.

Lemma 9. Let the assumptions of Theorem 3 hold, and let $\hat{\Omega}$ be a matrix
formed with $d-r$ columns of $\Omega$. With probability at least $1 - \epsilon/d$, $\hat{\Omega}$ will satisfy
(ii).

Proof. Let $\mathcal{E}$ be the event that $m(\Omega') < n(\Omega') + r$ for some matrix $\Omega'$ formed
with a subset of the columns in $\hat{\Omega}$. It is easy to see that this will only occur if
there is a matrix $\Omega'$ formed with $n$ columns of $\hat{\Omega}$ that has all its nonzero entries
in the same $n + r - 1$ rows. Let $\mathcal{E}_n$ denote the event that the matrix formed with
the first $n$ columns from $\hat{\Omega}$ has all its nonzero entries in the first $n + r - 1$ rows.
Then

$$P(\mathcal{E}) \le \sum_{n=1}^{d-r} \binom{d-r}{n} \binom{d}{n+r-1} P(\mathcal{E}_n). \tag{11}$$

If each column of $1 contains at least i nonzero entries, distributed uniformly 
and independently at random with i as in (3), it is easy to see that P(£ n ) = 0 
for n<t-r , and for i-r<n<d-r , 


P (£„) < 
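Since $\binom{d-r}{n} < \binom{d}{n+r-1}$, continuing with (11) we obtain: 

\begin{align}
P(\mathcal{E}) &< \sum_{n=\ell-r+1}^{d-r} \binom{d}{n+r-1}^{2} \left( \frac{n+r-1}{d} \right)^{\ell n} \nonumber \\
&\le \sum_{n=\ell}^{d/2} \binom{d}{n}^{2} \left( \frac{n}{d} \right)^{\ell (n-r+1)} \tag{12} \\
&\quad + \sum_{n=1}^{d/2} \binom{d}{d-n}^{2} \left( \frac{d-n}{d} \right)^{\ell (d-n-r+1)}. \tag{13}
\end{align}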

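For the terms in (12), write 

$$ \binom{d}{n}^{2} \left( \frac{n}{d} \right)^{\ell (n-r+1)} \le \left( \frac{de}{n} \right)^{2n} \left( \frac{n}{d} \right)^{\ell (n-r+1)}. \tag{14} $$

Since $n \ge \ell \ge 2r$, 

$$ (14) < \left( \frac{de}{n} \right)^{2n} \left( \frac{n}{d} \right)^{\frac{\ell}{2} n} = e^{2n} \left( \frac{n}{d} \right)^{\left( \frac{\ell}{2} - 2 \right) n}, \tag{15} $$

and since $n \le d/2$, 

$$ (15) \le e^{2n} \left( \frac{1}{2} \right)^{\left( \frac{\ell}{2} - 2 \right) n} = \left( e^{2} \cdot 2^{-\frac{\ell}{2} + 2} \right)^{n} < \frac{\epsilon}{d^{2}}, \tag{16} $$

where the last step follows because $\ell > 2 \log_2 \left( \frac{d^{2} e^{2}}{\epsilon} \right) + 4$. 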
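For the terms in (13), write 

$$ \binom{d}{d-n}^{2} \left( \frac{d-n}{d} \right)^{\ell (d-n-r+1)} \le \left( \frac{de}{n} \right)^{2n} \left( \frac{d-n}{d} \right)^{\ell (d-n-r+1)}. \tag{17} $$

In this case, since $1 \le n \le d/2$ and $r \le d/6$, we have 

\begin{align}
(17) &< (de)^{2n} \left( \frac{d-n}{d} \right)^{\ell \frac{d}{3}} = (de)^{2n} \left[ \left( 1 - \frac{n}{d} \right)^{d} \right]^{\frac{\ell}{3}} \nonumber \\
&\le (de)^{2n} \left[ e^{-n} \right]^{\frac{\ell}{3}}, \nonumber
\end{align}

which we may rewrite as 

$$ \left( e^{2 \log d} \right)^{n} \left( e^{2} \right)^{n} \left( e^{-\frac{\ell}{3}} \right)^{n} = \left( e^{2 \log d + 2 - \frac{\ell}{3}} \right)^{n} < \frac{\epsilon}{d^{2}}, \tag{18} $$

where the last step follows because $\ell > 3 \log \left( \frac{d^{2}}{\epsilon} \right) + 6 \log d + 6$. 

Substituting (16) and (18) in (12) and (13), we have that $P(\mathcal{E}) < \epsilon/d$. □ 

We are now ready to give the proof of Theorem 3. 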
Proof. (Theorem 3) If $N \ge r(d-r)$, randomly select disjoint matrices $\{\Omega_\tau\}_{\tau=1}^{r}$, 
each formed with d - r columns of $\Omega$. 

Union bounding over $\tau$, we may upper bound the probability that $\Omega$ fails to 
satisfy the conditions of Lemma 1 by 

$$ \sum_{\tau=1}^{r} P(\mathcal{E}) < \sum_{\tau=1}^{r} \frac{\epsilon}{d} < \sum_{\tau=1}^{r} \frac{\epsilon}{r} = \epsilon. $$

The first part of the statement follows because the conditions in Lemma 1 
imply the conditions in Theorem 1. 

If $N \ge (r+1)(d-r)$, randomly select disjoint matrices $\{\Omega_\tau\}_{\tau=1}^{r+1}$, each formed 
with d - r columns of $\Omega$. By the same arguments, the probability that $\Omega$ fails 
to satisfy the conditions of Theorem 2 is upper bounded by: 

$$ \sum_{\tau=1}^{r+1} P(\mathcal{E}) < \sum_{\tau=1}^{r+1} \frac{\epsilon}{d} < \sum_{\tau=1}^{r+1} \frac{\epsilon}{r+1} = \epsilon. $$

□ 
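
The probabilistic claim of Lemma 9 can be probed empirically on small instances. The Python sketch below is a Monte Carlo experiment under stated assumptions, not part of the proof: each of the d - r columns is observed on exactly `ell` rows drawn uniformly without replacement, the failure event is detected by brute force over all column subsets, and the function names and parameter values are illustrative. For such small d the values of `ell` are not chosen to satisfy (3), so the output is only a rough indication of how the failure rate behaves.

```python
import random
from itertools import combinations

def violates_condition(columns, r):
    """True if some n columns have all observed entries in fewer than n + r rows."""
    for n in range(1, len(columns) + 1):
        for subset in combinations(columns, n):
            rows = set().union(*subset)
            if len(rows) < n + r:
                return True
    return False

def estimate_failure_rate(d, r, ell, trials, seed=0):
    """Fraction of random d x (d - r) patterns that violate the condition."""
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    failures = 0
    for _ in range(trials):
        # Each column: ell observed rows, uniform without replacement.
        columns = [frozenset(rng.sample(range(d), ell)) for _ in range(d - r)]
        if violates_condition(columns, r):
            failures += 1
    return failures / trials

# Example run with hypothetical small parameters.
rate = estimate_failure_rate(d=8, r=1, ell=4, trials=200)
```

Two sanity checks follow directly from the definition: with `ell = d` every subset of n columns touches all d rows, so no pattern fails, while with `ell = 1` and `r = 1` every single column already fails.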


7 Conclusions 

In this paper we give sampling conditions for finite rank-r completability, that 
is, conditions on the set of observed entries to guarantee that a matrix can 
be completed in at most finitely many ways. We also provide deterministic 
sampling conditions for unique completability that can be efficiently verified. In 
addition, we show that uniform random samplings with only O(max{r, log d}) 
observed entries per column satisfy these conditions with high probability. These 
findings have several implications on LRMC regarding lower bounds, sample 
and computational complexity, the role of coherence, adaptive settings and the 
validation of any completion algorithm. 

We would like to thank Louis Theran for pointing out a mistake in a previous 
version of the paper. In that earlier version we erroneously assumed that 
columns with more than r + 1 observed entries would yield multiple independent 
constraints. However, as Theran pointed out through the following example in 
[11], these constraints may be algebraically dependent. For this reason, in our 
current analysis we use only one constraint per column. 

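Example 6. Suppose X is a rank-2 matrix observed on the entries indicated by 
1's in 

$$ \Omega = \begin{bmatrix} 1 & 1 & 1 & 0 & 0 \\ 1 & 1 & 1 & 0 & 0 \\ 1 & 1 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 & 1 \end{bmatrix}. $$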

Here $x_3$ is observed on r + 2 entries. Using Definition 1 in our earlier version 
of this paper, $x_3$ yields the two central columns of 

$$ \breve{\Omega} = \begin{bmatrix} 1 & 1 & 1 & 1 & 0 & 0 \\ 1 & 1 & 1 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 & 1 & 1 \end{bmatrix}. $$

The two central columns in $\breve{\Omega}$ correspond to $\Omega_3$, and encode the two polynomial 
constraints obtained from the observed entries in $x_3$. 

While the sampling $\Omega$ satisfies the completability conditions of Theorem 1 
in our earlier version of this paper, X cannot be completed. This is because the 
two polynomials defined by $x_3$ share the same coefficients, which makes them 
algebraically dependent. If instead of $x_3$ we observed two columns on the entries 
indicated by $\Omega_3$, then we would also obtain two polynomial constraints, but 
these polynomials would have generic coefficients, and would be algebraically 
independent with probability 1. In fact, a matrix observed on the entries indicated 
in $\breve{\Omega}$ (or more) can indeed be completed. 
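
The column expansion used in Example 6 can be sketched in Python. This is a hedged reconstruction: Definition 1 of the earlier version is not reproduced here, so the rule below (keep the first r observed rows of a column, then emit one new column per remaining observed row) is an assumption inferred from the two matrices of the example, and the function name `expand_sampling` is hypothetical.

```python
def expand_sampling(omega, r):
    """Split every column with k > r observed entries into k - r columns,
    each holding exactly r + 1 ones (assumed rule, inferred from Example 6)."""
    d = len(omega)
    expanded_cols = []
    for j in range(len(omega[0])):
        rows = [i for i in range(d) if omega[i][j]]
        if len(rows) <= r:
            continue  # too few observed entries to yield a constraint
        base = rows[:r]  # first r observed rows, shared by all new columns
        for extra in rows[r:]:
            col = [0] * d
            for i in base + [extra]:
                col[i] = 1
            expanded_cols.append(col)
    # Transpose the list of columns back into a list of rows.
    return [list(row) for row in zip(*expanded_cols)]

# Sampling pattern of Example 6 (rank r = 2), written row by row.
omega = [
    [1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0],
    [1, 1, 0, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
]
expanded = expand_sampling(omega, r=2)
```

Under this assumed rule the third column, observed on r + 2 entries, produces the two central columns of the expanded matrix, and every other column is copied unchanged. The sketch only builds the pattern; it says nothing about whether the resulting polynomial constraints are algebraically independent, which is exactly the issue the example raises.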